Abstract

We introduce a stochastic particle system that corresponds to the Fokker-Planck 
equation with decay in the many-particles limit, and study its large deviations. We 
show that the large-deviation rate functional corresponds to an energy-dissipation 
functional in a Gamma-convergence sense. Moreover, we prove that the resulting 
functional, which involves entropic terms and the Wasserstein metric, is again a 
variational formulation for the Fokker-Planck equation with decay. 
1 Introduction

1.1 On the origin of Wasserstein gradient flows 

Since the introduction of the Wasserstein gradient flows in 1997-8 [JKO97, JKO98, Ott98a, Ott01] it has become clear that a very large number of well-known parabolic partial differential equations and other evolutionary systems can be written as gradient flows. Examples of these are nonlinear drift-diffusion equations [Agu05], diffusion-drift equations with nonlocal interactions [CMV03], higher-order parabolic equations [Ott98b, GO01, Gla03, MMS09, GST09], moving-boundary problems [Ott98b, PP10], and chemical reactions [Mie11]. The parallel development of rate-independent systems introduced similar variational structures for friction [EM06], delamination [KMR06], plasticity [Mie04], phase transformations [MTL02], hysteresis [MT04], and various other phenomena. Further generalisations are suggested by taking limits of gradient flows, as in the case of Kramers' equation for chemical reactions [AMP+11].

This multitude of gradient-flow structures does raise questions. Before 1997, for in- 
stance, it was widely believed that convection-diffusion equations could not be gradient 
flows. This belief was contradicted by [JKO97, JKO98]; apparently the question 'which
systems can be gradient flows' is a non-trivial one. As another example, common build- 
ing blocks of these gradient-flow structures, such as the Wasserstein metric, appear to be 
mathematical, non-physical constructs — can one give these an interpretation in terms of 
physics, chemistry, or other modelling contexts? 

In [ADPZ10] the authors give a suggestion for an organising principle behind the observed variety in systems and gradient flows. For the example of the entropy-Wasserstein
gradient flow (see below) they show how the gradient-flow structure itself is closely re- 
lated to the probabilistic structure of a system of stochastic particles. This connection 
explains many aspects of the gradient flow, such as the origin of both the entropy and the 
Wasserstein metric and the interpretation of the discrete-time approximation. 

The result of [ADPZ10] also suggests that this connection between gradient-flow struc- 
tures and stochastic particle systems may be much more general. In this paper we explore 
this idea for the following diffusion equation with convection and decay: 



\[ \partial_t u = \partial_{yy} u + \partial_y\big(u\,\partial_y\Psi\big) - \lambda u, \qquad \text{in } \mathbb{R}\times(0,\infty), \tag{1} \]

with $\Psi \in C_b^2(\mathbb{R})$ and $\lambda \ge 0$. We contribute two main results to the theory of this type of equations: first, we derive a new gradient-flow formulation for equation (1), and secondly, since this formulation is constructed along the lines of [ADPZ10], we automatically connect this gradient flow to microscopic systems of diffusing particles, and show that the gradient-flow structure arises from the probabilistic structure of these particle systems.

The paper is organised as follows. In the remainder of this introductory section we 
develop the required concepts and formulate the main aim of this paper in a little more 
detail. Next, we recall the central notions of this paper in Section 2. We proceed with our microscopic models and the corresponding results in Sections 3 and 4, and we wrap up with a general discussion in the final section. In the Appendix we give a description and the proof of an existing large-deviation result in a language that is more suited to this paper, and prove an a priori estimate that will be needed in Section 4.



1.2 Variational formulations 

In this paper we study iterative variational schemes on some space X of the form 

\[ \text{Given } \rho^{k-1}, \text{ choose } \rho^{k} \in \operatorname*{argmin}_{\rho\in X} K^h(\rho;\rho^{k-1}), \tag{2} \]

which will approximate the solution of an evolution equation as $h \to 0$. The following
examples illustrate the main ideas. 



Example 1: Hilbert-space gradient flows. If $X$ is a Hilbert space and the functional $K^h$ is of the form

\[ K^h(\rho;\bar\rho) = \mathcal{E}(\rho) + \frac{1}{2h}\|\rho-\bar\rho\|^2 \tag{3} \]

for some smooth functional $\mathcal{E}$, then the minimisation problem (2) gives the stationarity condition

\[ \frac{\rho^{k}-\rho^{k-1}}{h} = -\operatorname{grad}\mathcal{E}(\rho^{k}). \]

In this one recognises the backward Euler approximation of the continuous-time gradient flow:

\[ \partial_t \rho = -\operatorname{grad}\mathcal{E}(\rho). \tag{4} \]

The time-discrete variational form (3) illustrates how in gradient flows the evolution is driven by a trade-off between two competing effects. An energy functional $\mathcal{E} : X \to \mathbb{R}$ drives the system towards lower values of the energy; at the same time a dissipation mechanism (here quantified by the norm $\|\cdot\|$) acts as a selection principle among all directions that decrease $\mathcal{E}$.

If one chooses $X = L^2(\mathbb{R})$ and $\mathcal{E}(\rho) = \frac12\int |\partial_x\rho|^2\,dx$, then (4) simply becomes the diffusion
equation. However, it is not possible to describe convection in this way. The next example 
shows that convection-diffusion equations are nevertheless gradient flows, in a more general 
context. 
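The backward Euler structure of (2)-(4) can be made concrete in a setting where everything is explicit. The following is a minimal sketch, assuming a finite-dimensional Hilbert space in which the energy is diagonal, $\mathcal{E}(c) = \frac12\sum_i \kappa_i c_i^2$ (as for Fourier coefficients of the diffusion equation); the function names and the choice of $\kappa_i$ are illustrative assumptions, not part of the paper.

```python
import math

def minimising_movement_step(coeffs_prev, kappas, h):
    """One step of scheme (2) with K^h from (3) for the diagonal quadratic energy
    E(c) = 0.5 * sum_i kappa_i * c_i**2 (an assumed toy energy).
    Stationarity (c_i - c_prev_i)/h = -kappa_i * c_i gives the closed form below."""
    return [c / (1.0 + h * k) for c, k in zip(coeffs_prev, kappas)]

def run_scheme(coeffs0, kappas, h, t_end):
    """Iterate the variational scheme up to (approximately) time t_end."""
    steps = int(round(t_end / h))
    coeffs = list(coeffs0)
    for _ in range(steps):
        coeffs = minimising_movement_step(coeffs, kappas, h)
    return coeffs

def exact_flow(coeffs0, kappas, t):
    """Solution of the continuous-time gradient flow (4) for the same energy."""
    return [c * math.exp(-k * t) for c, k in zip(coeffs0, kappas)]
```

For small $h$ the output of `run_scheme` should be close to `exact_flow`, which is the sense in which the time-discrete scheme approximates the evolution equation.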



Example 2: Wasserstein gradient flows. Instead of a Hilbert space, we now consider the metric space $X = \mathcal{P}_2(\mathbb{R})$ of probability measures with finite second moment, equipped with the Wasserstein metric $d$ (see Section 2.1). Similarly to (3), let (where the subscript FP stands for 'Fokker-Planck'):

\[ K^h_{FP}(\rho;\bar\rho) := \tfrac12\mathcal{F}(\rho) - \tfrac12\mathcal{F}(\bar\rho) + \frac{1}{4h}\,d(\bar\rho,\rho)^2, \tag{5} \]

where $\mathcal{F}(\rho) = S(\rho) + E(\rho)$ is the Helmholtz free energy, and

\[ S(\rho) := \begin{cases} \displaystyle\int_{\mathbb{R}} \log f(y)\,\rho(dy), & \text{if } \rho(dy) = f(y)\,dy,\\[1ex] \infty, & \text{otherwise,} \end{cases} \qquad E(\rho) := \int_{\mathbb{R}} \Psi(y)\,\rho(dy), \tag{6} \]

are the (negative) Gibbs-Boltzmann entropy and the energy arising from a potential $\Psi$. Note that in comparison to (3) we have subtracted the free energy of the previous state, and multiplied the expression by 1/2. Both are done in view of the connection to large-deviation rate functionals that we establish below; of course neither change affects the minimisation properties of $K^h_{FP}(\,\cdot\,;\bar\rho)$.

It was first observed by Jordan, Kinderlehrer and Otto [JKO97, JKO98] that the time-discrete process defined by (2) and (5) converges to the solution of the Fokker-Planck equation:

\[ \partial_t u = \partial_{yy} u + \partial_y\big(u\,\partial_y\Psi\big), \qquad \text{in } \mathbb{R}\times(0,\infty). \tag{7} \]

Thus, in the same sense as the previous example, the Fokker-Planck equation is a gradient 
flow of free energy with respect to the Wasserstein metric. 

For future reference, we duplicate their main theorem here (where the superscript a 
denotes absolutely continuous): 

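Theorem 1 ([JKO98]). Let $\rho^0 \in \mathcal{P}_2^a(\mathbb{R})$ and define the sequence $\{\rho^{h,k}\}_{k\ge0}$ by:

\[ \rho^{h,k} \in \operatorname*{argmin}_{\rho\in\mathcal{P}_2(\mathbb{R})} K^h_{FP}(\rho;\rho^{h,k-1}), \qquad k \ge 1. \]

These minimisers exist uniquely, and for all $t > 0$, when $h \to 0$ the function $\rho^{h,\lfloor t/h\rfloor}$ converges weakly in $L^1(\mathbb{R})$ to the solution of (7) with initial condition $\rho^0$.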

While various generalisations of Hilbert-space gradient flows were known for some time [ATW93, DG06, LS95], this result meant a breakthrough by extending the concept to a large and important class of evolution equations. In addition to inspiring a great amount of research into gradient flows in Wasserstein spaces and in general metric spaces, in a variety of functional-analytic settings [Ott01, Mie05, AGS08, RMS08], it also gave rise to many fruitful connections between partial differential equations, optimal transport theory, geometry, functional inequalities, and probability; see [Vil03, Vil09] for an overview.

Example 3: exponential decay. As in some other cases [ATW93, LS95], it will be useful to consider more general time-discrete constructions, namely of the form

\[ K^h(a;\bar a) = \mathcal{E}(a;\bar a) + f^h(a,\bar a), \tag{8} \]

for some function $f^h$. In this example, fix some $0 < r^h < 1$ and let the state space be $X = \mathbb{R}_+$. Take for $\mathcal{E}$ a mixing entropy with parameter $\bar a$,

\[ \mathcal{E}(a;\bar a) := a\log a + (\bar a - a)\log(\bar a - a), \qquad \text{for } 0 < a < \bar a, \tag{9} \]

and for $f^h$ the expression

\[ f^h(a,\bar a) := -a\log r^h - (\bar a - a)\log(1 - r^h). \tag{10} \]

Then the unique minimiser of (8) is $a = r^h\bar a$: the functional is strictly convex in $a$, and setting its derivative to zero gives $\log\frac{a}{\bar a - a} = \log\frac{r^h}{1-r^h}$. While this construction may appear to be a convoluted way of arriving at this result, in fact it appears naturally in the context of a specific stochastic system of particles, as we show below. In the limit $h \to 0$ it will describe the term $-\lambda u$ in (1) which is associated with decay, as is illustrated by the following simple result:

Theorem 2. Let $K^h$ be given as in (8)-(10) with $r^h := e^{-\lambda h}$. Let $a^0 \in \mathbb{R}_+$ be fixed and define the sequence $\{a^{h,k}\}_{k\ge0}$ by

\[ a^{h,0} = a^0, \qquad a^{h,k} \in \operatorname*{argmin}_{a\in\mathbb{R}_+} K^h(a; a^{h,k-1}), \quad k \ge 1. \]

Then as $h \to 0$ the function $t \mapsto a^{h,\lfloor t/h\rfloor}$ converges in time to the solution $t \mapsto a^0 e^{-\lambda t}$ of $\partial_t u = -\lambda u$.

The proof follows from remarking that $a^{h,k} = a^0 e^{-\lambda k h}$.
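The scalar scheme of Theorem 2 is simple enough to check numerically. The sketch below evaluates the functional (8)-(10) directly and iterates the closed-form minimiser $a = r^h\bar a$; the function names are illustrative assumptions.

```python
import math

def K(a, a_prev, r):
    """K^h(a; a_prev) from (8)-(10), valid for 0 < a < a_prev and 0 < r < 1."""
    mixing_entropy = a * math.log(a) + (a_prev - a) * math.log(a_prev - a)
    penalty = -a * math.log(r) - (a_prev - a) * math.log(1.0 - r)
    return mixing_entropy + penalty

def decay_scheme(a0, lam, h, steps):
    """Iterate the minimiser a = r^h * a_prev with r^h = exp(-lam * h)."""
    r = math.exp(-lam * h)
    sequence = [a0]
    for _ in range(steps):
        sequence.append(r * sequence[-1])
    return sequence
```

Evaluating `K` at `r * a_prev` and at nearby points confirms the minimiser, and `decay_scheme(a0, lam, h, k)[-1]` reproduces $a^0 e^{-\lambda k h}$ up to rounding.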

Below we will consider this construction in integrated form: 

\[ K^h_{Dc}(\rho;\bar\rho) := -S(\bar\rho) + S(\rho) + S(\bar\rho-\rho) - |\rho|\log r^h - |\bar\rho-\rho|\log(1-r^h) \tag{11} \]

(the subscript Dc stands for 'Decay equation') on the space of non-negative Borel measures $\mathcal{M}^+(\mathbb{R})$ with the total variation norm $|\rho| := \rho(\mathbb{R})$. Observe that compared to (8)-(10), we have an additional term $-S(\bar\rho)$. This term does not influence the minimiser, but we have added it here to ensure that the minimum is 0, which will be needed below.

Synthesis of examples 2 and 3. In the results that we prove in this paper, the last 
two examples can be recognised separately in a single variational scheme. In the simplest 
case, for instance, where $\Psi = 0$, the discrete algorithm approximating (1) becomes

\[ \rho^k \in \operatorname*{argmin}_{\rho\in\mathcal{M}^+(\mathbb{R})}\ \inf_{\rho_{ND}\,:\,|\rho+\rho_{ND}| = |\rho^{k-1}|} \Big[ -\tfrac12 S(\rho+\rho_{ND}) - \tfrac12 S(\rho^{k-1}) + \frac{1}{4h}\,d(\rho+\rho_{ND},\rho^{k-1})^2 + S(\rho) + S(\rho_{ND}) - |\rho|\log r^h - |\rho_{ND}|\log(1-r^h) \Big]. \tag{12} \]

To interpret the formula above, one should realise that the infimum over the measure $\rho_{ND}$ represents a choice: in each time step, the system designates a portion $\rho_{ND}$ for decay (the index ND stands for 'Normal to Decayed'), while the other part $\rho$ remains 'normal'.
The terms inside the infimum can be written as $K^h_{FP}(\rho+\rho_{ND};\rho^{k-1}) + K^h_{Dc}(\rho;\rho+\rho_{ND})$, and one can understand the structure of (12) through this splitting. The functional $K^h_{FP}(\rho+\rho_{ND};\rho^{k-1})$ characterises a single time-step of diffusion of $\rho^{k-1}$, according to Theorem 1. Decay is left out of this step, since the joint mass $\rho+\rho_{ND}$ is independent of the distribution over normal ($\rho$) and decayed matter ($\rho_{ND}$). In a second step, given a choice for $\rho+\rho_{ND}$, the second functional $K^h_{Dc}(\rho;\rho+\rho_{ND})$ describes how the total mass $\rho+\rho_{ND}$ is divided over $\rho$ and $\rho_{ND}$, according to Theorem 2. As such, we can interpret $\rho+\rho_{ND}$ as an intermediate state between $\rho^{k-1}$ and $\rho$.



1.3 From microscopic model to large deviations 

We claimed above that the approximation scheme arises naturally in the context of stochas- 
tic particle systems. We now describe this context. It is well known (going back at least 
to Einstein [Ein05]) that the diffusion equation

\[ \partial_t u = \partial_{yy} u, \qquad \text{in } \mathbb{R}\times(0,\infty), \tag{13} \]

is the macroscopic (hydrodynamic, continuum) limit of a wide range of stochastic particle systems [DMP91]. Here we focus on one such system, composed of independent Brownian particles.

More specifically, let all particles $1,\dots,n$ be initially distributed according to some fixed $\bar\rho \in \mathcal{P}(\mathbb{R})$, and, for a fixed time interval $h > 0$, let each particle $i = 1,\dots,n$ move to a new position $Y^h_{i,n}$, where the probability of moving from $x$ to $y$ is given by the density

\[ \theta^h(y-x) := \frac{1}{\sqrt{4\pi h}}\exp\Big(-\frac{(y-x)^2}{4h}\Big). \tag{14} \]

The empirical measure $L^h_n := n^{-1}\sum_{i=1}^n \delta_{Y^h_{i,n}}$ then is a random probability measure that describes the distribution of all $n$ particles in space at time $h$. This measure converges (as $n \to \infty$) to $\bar\rho * \theta^h$, the solution of (13) at time $h$ with initial condition $\bar\rho$.

The speed of this convergence is characterised by a large-deviation principle, which we discuss in Section 2.2. It states that the probability of finding $L^h_n$ close to some $\rho \in \mathcal{P}(\mathbb{R})$ converges exponentially to zero with rate $n J^h_{Df}(\rho;\bar\rho)$ (the subscript Df stands for 'Diffusion equation'):

\[ \operatorname{Prob}(L^h_n \approx \rho) \sim \exp\big(-n J^h_{Df}(\rho;\bar\rho)\big) \qquad \text{as } n \to \infty. \]

The rate functional $J^h_{Df}(\,\cdot\,;\bar\rho)$ is non-negative and minimised by the solution of (13) at time $h$.

1.4 From large deviations to Wasserstein gradient flow 

When restricting ourselves to the diffusion equation (13), the gradient-flow functional (5) reduces to

\[ K^h_{Df}(\rho;\bar\rho) := \tfrac12 S(\rho) - \tfrac12 S(\bar\rho) + \frac{1}{4h}\,d(\bar\rho,\rho)^2. \tag{15} \]






In [ADPZ10] the authors show that not only the minimisers of $J^h_{Df}$ and $K^h_{Df}$ have the same limit, but the two are in fact strongly related:

Theorem 3 ([ADPZ10]). Fix an $L > 0$. There exists a $0 < \delta < 1$ such that for all $\bar\rho \in A_\delta \cap C([0,L])$,

\[ J^h_{Df}(\,\cdot\,;\bar\rho) - \frac{1}{4h}\,d(\bar\rho,\cdot\,)^2 \ \xrightarrow[h\to0]{\Gamma}\ \tfrac12 S(\,\cdot\,) - \tfrac12 S(\bar\rho) = K^h_{Df}(\,\cdot\,;\bar\rho) - \frac{1}{4h}\,d(\bar\rho,\cdot\,)^2 \tag{16} \]

in the set

\[ A_\delta := \Big\{ \rho \in L^\infty(0,L) : \int_0^L \rho = 1 \text{ and } \|\rho - L^{-1}\|_\infty < \delta \Big\}, \tag{17} \]

equipped with the narrow topology, defined by duality with continuous bounded functions. (See Section 2.3 for the notion of $\Gamma$-convergence.)

Note that the term $-(4h)^{-1}d(\bar\rho,\cdot\,)^2$ appears on both sides of (16). The role of this term is to compensate the singular behaviour of both $J^h_{Df}$ and $K^h_{Df}$ in the limit $h \to 0$. Morally, Theorem 3 states that

\[ \text{as } h \to 0, \qquad J^h_{Df}(\,\cdot\,;\bar\rho) \sim K^h_{Df}(\,\cdot\,;\bar\rho). \]

This connection shows how the functional $K^h_{Df}$, which defines the time-discretised gradient flow, can be interpreted physically: as the large-deviation rate functional of the microscopic model.



1.5 Overview of this work 

In this article we extend the results of [ADPZ10] to equation (1). The main results mirror those of Theorems 1 and 3. We divide the arguments, and the paper, into two parts.

In the first part we discuss diffusion with drift but without decay ($\Psi \neq 0$, $\lambda = 0$ in (1)). First we construct a system of Brownian particles with drift that models the Fokker-Planck equation (7), and then derive a corresponding large-deviation principle. In our first main result, Theorem 5, we show that for small times the large-deviation rate functional of the micro model relates to $K^h_{FP}$ in the same sense as in Theorem 3. Note that the expression for the gradient-flow functional $K^h_{FP}$ is already known from [JKO98]; the novelty of the current result lies in the connection to the microscopic particle system.

The second part of the paper concerns the diffusion equation with decay ($\lambda > 0$, and for ease of notation we first take $\Psi = 0$):

\[ \partial_t u = \partial_{yy} u - \lambda u, \qquad \text{in } \mathbb{R}\times(0,\infty). \tag{18} \]

Again, we devise a particle system that models this equation microscopically, and derive a corresponding large-deviation principle. In the second main result of this paper, Theorem 6, we show that the large-deviation rate functional relates to an energy-dissipation functional (29) in the same way as in Theorem 3. Finally, in Theorem 7 we show that the minimisers of this new functional indeed approximate the solution of (18) in the sense of Theorem 1. In this case, the novelty lies in both the expression of the energy-dissipation functional, and in its connection to the microscopic system.






2 Background 



2.1 Wasserstein distance 

In the Kantorovich formulation of the optimal transport problem, a transport plan between 
two measures $\bar\rho, \rho \in \mathcal{P}(\mathbb{R})$ is a probability measure in the set

\[ \Gamma(\bar\rho,\rho) := \big\{ q \in \mathcal{M}^+(\mathbb{R}\times\mathbb{R}) : \text{for all Borel sets } B \subset \mathbb{R},\ q(B\times\mathbb{R}) = \bar\rho(B) \text{ and } q(\mathbb{R}\times B) = \rho(B) \big\}. \tag{19} \]

In the particular case of the 2-Wasserstein distance (henceforth simply called the Wasserstein distance), the unit cost of transporting an infinitesimal mass from position $x$ to $y$ is taken to be $|x-y|^2$. One can then ask for the optimal transport plan that transports all mass from a probability measure $\bar\rho$ to another measure $\rho$. The minimum cost

\[ d(\bar\rho,\rho)^2 := \min_{q\in\Gamma(\bar\rho,\rho)} \iint |x-y|^2\, q(dx\,dy) \]

defines a metric $d$ on the space of probability measures with finite second moments (i.e. $\int |x|^2\,d\rho < \infty$) [GS84], and is called the Wasserstein distance. The convergence induced by the Wasserstein metric coincides with narrow convergence, defined by duality with continuous bounded functions, augmented with the convergence of second moments [Vil03, Th. 7.12].

Observe that the definition implies

\[ d(\rho_1+\rho_2,\rho_3+\rho_4)^2 \le d(\rho_1,\rho_3)^2 + d(\rho_2,\rho_4)^2 \qquad \text{for all } \rho_{1,2,3,4} \text{ with } |\rho_1| = |\rho_3| \text{ and } |\rho_2| = |\rho_4|. \tag{20} \]

This property will be used later in the article. 
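For intuition, the transport cost can be computed explicitly in one dimension for measures consisting of equally weighted atoms, where the optimal plan for the quadratic cost pairs the atoms in sorted order. The sketch below relies on that standard one-dimensional fact; the function name and the `weight` parameter (the mass per atom, so that measures of total mass different from 1 can be handled) are illustrative assumptions.

```python
def w2_squared(xs, ys, weight=None):
    """Squared transport cost min_q sum |x - y|^2 q between two one-dimensional
    measures with the same number of atoms, each atom carrying mass `weight`
    (default 1/n, i.e. probability measures). In 1-D the monotone (sorted)
    pairing is optimal for the quadratic cost."""
    if len(xs) != len(ys):
        raise ValueError("both measures need the same number of atoms")
    if weight is None:
        weight = 1.0 / len(xs)
    return weight * sum((x - y) ** 2 for x, y in zip(sorted(xs), sorted(ys)))
```

A small instance of property (20): with $\rho_1 = \rho_4 = \frac12\delta_0$ and $\rho_2 = \rho_3 = \frac12\delta_1$, the sums $\rho_1+\rho_2$ and $\rho_3+\rho_4$ coincide, so the left-hand side vanishes while the right-hand side equals 1.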



2.2 Large deviations 

Recall from the law of large numbers that with probability 1, in the large-$n$ limit the expectation $\mathbb{E}L^h_n$ is the only event that occurs. In this limit, any other event is considered a large deviation from this expected behaviour. A large-deviation principle characterises the unlikeliness of such an event by the speed of convergence of its probability to 0.

To illustrate this, we briefly switch to a more abstract notation. We say that a sequence $X_n$ of random variables with values in $\mathcal{X}$ satisfies the large-deviation principle with speed $n$ and rate functional $J : \mathcal{X} \to [0,\infty]$ whenever:

1. $J$ is not identically $\infty$, and $J^{-1}[0,c]$ is compact for all $c < \infty$;

2. $\liminf_{n\to\infty} \frac1n \log\operatorname{Prob}(X_n \in U) \ge -\inf_{x\in U} J(x)$ for all open sets $U \subset \mathcal{X}$;

3. $\limsup_{n\to\infty} \frac1n \log\operatorname{Prob}(X_n \in C) \le -\inf_{x\in C} J(x)$ for all closed sets $C \subset \mathcal{X}$.
The rate functional $J$ is non-negative and achieves its minimum of zero at the most probable behaviour of $X_n$. The right-hand infimum reflects the general principle that "any large deviation is done in the least unlikely of all the unlikely ways" [dH00, p. 10]. A related mathematical result is the contraction principle [DZ87, Th. 4.2.1], which states the following. Let $p : \mathcal{X} \to \mathcal{Y}$ be a continuous map, and $Y_n := p(X_n)$ the corresponding random variables. Then $Y_n$ satisfies a large-deviation principle similar to the one above, with rate functional

\[ J_p(y) := \inf_{x\in\mathcal{X}\,:\,p(x)=y} J(x). \]

This contraction principle will be used throughout this paper. For instance, it explains the role of the minimisation in (12).

2.3 Gamma-convergence

Gamma-convergence [DM93] is a notion of convergence of functionals that is appropriate for the study of sequences of minimisation problems. It is also a valuable tool in the study of large deviations (cf. [ADPZ10, Lem. 2]) and gradient flows (cf. [DG06, SS03]). Moreover, in [Leo05], Gamma-convergence is used to connect large deviations to optimal transport. In this sense it provides us a natural notion of convergence for the purpose of this study.

If $Q$ is a first-countable (e.g. metrisable) topological space, and $\{F^h\}_h$ is a sequence of functionals on that space, then we say that $F^h$ Gamma-converges to some functional $F : Q \to \mathbb{R}\cup\{\infty\}$ as $h \to 0$ whenever

1. (Lower bound) For any sequence $\rho^h \to \rho$ in $Q$ (as $h \to 0$) there holds

\[ \liminf_{h\to0} F^h(\rho^h) \ge F(\rho); \]

2. (Recovery sequence) For all $\rho \in Q$ there is a sequence $\rho^h \to \rho$ in $Q$ (as $h \to 0$) such that

\[ \limsup_{h\to0} F^h(\rho^h) \le F(\rho). \]

One important implication of this definition is that if each $F^h$ has a minimiser $\rho^h$, and $\rho^h \to \rho$ in $Q$, then $\rho$ is a minimiser of $F$.

Note that this concept of convergence depends on the topology of the space Q. In this 
paper, unless indicated otherwise, we use the weak-* topology on the space of finite Borel 
measures, defined by duality with bounded and continuous functions. We use $\rightharpoonup$ to indicate convergence in this topology, and $\xrightarrow{\Gamma}$ to indicate the corresponding Gamma-convergence of functionals on that space.
3 Diffusion with drift



In this section we discuss the case of diffusion with drift but without decay ($\Psi \neq 0$, $\lambda = 0$), i.e. equation (7). First we describe the particle system that we use as a microscopic model for this equation, and derive the corresponding large-deviation principle. Next, we show that the large-deviation rate functional relates to the energy-dissipation functional (5) in a Gamma-convergence sense.

3.1 Microscopic model 

Consider a system of $n$ independent (i.e. non-interacting) point particles in $\mathbb{R}$. We wish $\bar\rho \in \mathcal{P}(\mathbb{R})$ to represent the distribution of initial positions, and implement this as in [Leo07]. For each $n$ choose $x_{i,n} \in \mathbb{R}$, $1 \le i \le n$, such that

\[ \frac1n \sum_{i=1}^n \delta_{x_{i,n}} \rightharpoonup \bar\rho \qquad \text{as } n \to \infty. \tag{21} \]

We then set the (deterministic) initial position of particle $i \in \{1,\dots,n\}$ to be $x_{i,n}$.¹

The dynamics of the system is determined by the probability for particle $i$ to move from $x_{i,n}$ to a (random) position $Y^h_{i,n}$ in some fixed time $h > 0$. We take this transition probability to be the fundamental solution $\eta^h(y;x)$ (see Definition 4 below) of the drift-diffusion equation (7). More precisely,

Definition 4. We say that a mapping $\eta : \mathbb{R}\times[0,\infty) \to \mathcal{P}(\mathbb{R})$ is a fundamental solution of the Fokker-Planck equation (7) whenever

1. $t \mapsto \eta^{x,t}(B)$ is locally Lebesgue integrable for all Borel sets $B \subset \mathbb{R}$ and all $x \in \mathbb{R}$; and

2. for all $\phi \in C_b^{2,1}([0,\infty)\times\mathbb{R})$ and $(x,T) \in \mathbb{R}\times[0,\infty)$ there holds:

\[ \int_0^T\!\!\int_{\mathbb{R}} \big(\partial_t\phi + \partial_{yy}\phi - \partial_y\Psi\,\partial_y\phi\big)\,\eta^{x,t}(dy)\,dt = \int_{\mathbb{R}} \phi(y,T)\,\eta^{x,T}(dy) - \phi(x,0). \]

If we assume that $\Psi \in C_b^2(\mathbb{R})$, then there exists an absolutely continuous fundamental solution with a density in $C^{2,1}(\mathbb{R}\times(0,\infty))$ [Fri64, Th. 1.10]. We can thus identify this fundamental solution $\eta^{x,t}$ with its density $\eta^t(\,\cdot\,;x)$.

Using this fundamental solution as the transition probability, the empirical measure $L^h_n = n^{-1}\sum_{i=1}^n \delta_{Y^h_{i,n}}$ will converge to $\bar\rho * \eta^h$, which is the solution to (7) at time $h$ with initial condition $\bar\rho$. In this sense the proposed system is indeed a microscopic precursor of this equation.

1 This way of enforcing the initial distribution $\bar\rho$ is different from the approach of [ADPZ10]. It provides
a more direct result, and is easier to interpret; see Remark [HI for a discussion. 






3.2 From large deviations to Wasserstein gradient flow 

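The sequence $L^h_n$ satisfies a large-deviation principle with rate $n$ and rate functional (see Corollary 13 in the Appendix):

\[ J^h_{FP}(\rho;\bar\rho) := \inf_{q\in\Gamma(\bar\rho,\rho)} \mathcal{H}(q\,|\,\bar\rho\,\eta^h), \tag{22} \]

where by abuse of notation we write $(\bar\rho\,\eta^h)(dx\,dy) = \bar\rho(x)\,\eta^h(y;x)\,dx\,dy$. Here the relative entropy between two non-negative Borel measures $q, q^0$ on $\mathbb{R}\times\mathbb{R}$ is defined as:

\[ \mathcal{H}(q\,|\,q^0) := \begin{cases} \displaystyle\iint \log\Big(\frac{dq}{dq^0}(x,y)\Big)\, q(dx\,dy), & \text{if } q \ll q^0,\\[1ex] \infty, & \text{otherwise.} \end{cases} \tag{23} \]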

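In analogy to Theorem 3 we prove the following relationship between this rate functional $J^h_{FP}$ and the gradient-flow functional $K^h_{FP}$ (given by (5)):

Theorem 5. Assume $\Psi \in C_b^2(\mathbb{R})$. Fix $L > 0$ and choose $0 < \delta < 1$ by Theorem 3. For all $\bar\rho \in A_\delta \cap C([0,L])$,

\[ J^h_{FP}(\,\cdot\,;\bar\rho) - \frac{1}{4h}\,d(\bar\rho,\cdot\,)^2 \ \xrightarrow[h\to0]{\Gamma}\ \tfrac12 S(\,\cdot\,) - \tfrac12 S(\bar\rho) + \tfrac12 E(\,\cdot\,) - \tfrac12 E(\bar\rho) = K^h_{FP}(\,\cdot\,;\bar\rho) - \frac{1}{4h}\,d(\bar\rho,\cdot\,)^2, \tag{24} \]

in $A_\delta$, equipped with the narrow topology.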



The proof relies heavily on an estimate of the fundamental solution $\eta^h$. To explain this estimate morally, observe that if $\Psi$ is affine, i.e. $\Psi(x) = cx$, then the force field $\partial_y\Psi$ is homogeneous, leading to constant drift $c$. In this simple case, the fundamental solution can be written explicitly:

\[ \eta^t(y;x) = \frac{1}{\sqrt{4\pi t}}\, e^{-|y-(x-ct)|^2/4t} = \theta^t(y-x)\, e^{-\frac12\Psi(y) + \frac12\Psi(x) - \frac14 c^2 t}, \tag{25} \]

where $\theta^t$ is again the diffusion kernel (14). Although for an arbitrary $\Psi$ an analytic expression for the fundamental solution is generally difficult to find, the expression (25) above suggests that it can be estimated by something similar for small times. Indeed, in Theorem 15 in the appendix, we prove that there exist $\beta_0, \beta_1 \in \mathbb{R}$ such that for all $t > 0$ and almost every $x, y \in \mathbb{R}$:

\[ \theta^t(y-x)\, e^{-\frac12\Psi(y) + \frac12\Psi(x) + \beta_0 t} \le \eta^t(y;x) \le \theta^t(y-x)\, e^{-\frac12\Psi(y) + \frac12\Psi(x) + \beta_1 t}. \tag{26} \]

Observe that the factors 1/2 in the exponent correspond to the factors 1/2 of the energy in expression (5). We are now ready to prove the Gamma-convergence result.
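The factorisation in (25) amounts to expanding the square $|y-x+ct|^2 = |y-x|^2 + 2ct(y-x) + c^2t^2$ in the exponent. As a quick sanity check, the sketch below evaluates both sides of (25) numerically for the affine potential $\Psi(x) = cx$; the function names are illustrative assumptions.

```python
import math

def theta(t, z):
    """Diffusion kernel (14) at time t, evaluated at displacement z."""
    return math.exp(-z * z / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def eta_shifted(t, y, x, c):
    """Left-hand form of (25): the diffusion kernel centred at x - c*t."""
    return theta(t, y - (x - c * t))

def eta_factored(t, y, x, c):
    """Right-hand form of (25) with Psi(x) = c*x."""
    return theta(t, y - x) * math.exp(-0.5 * c * y + 0.5 * c * x - 0.25 * c * c * t)
```

Both functions agree up to floating-point rounding for any $t > 0$, which is the identity the estimate (26) generalises to non-affine $\Psi$.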
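Proof of Theorem 5. To prove the lower bound, take any sequence $\rho^h \rightharpoonup \rho$ in $A_\delta$, and calculate

\begin{align*}
\liminf_{h\to0}\ J^h_{FP}(\rho^h;\bar\rho) - \frac{1}{4h}\,d(\bar\rho,\rho^h)^2
&\overset{(22)}{=} \liminf_{h\to0}\ \inf_{q\in\Gamma(\bar\rho,\rho^h)} \mathcal{H}(q\,|\,\bar\rho\,\eta^h) - \frac{1}{4h}\,d(\bar\rho,\rho^h)^2 \\
&\overset{(26)}{\ge} \liminf_{h\to0}\ \inf_{q\in\Gamma(\bar\rho,\rho^h)} \mathcal{H}(q\,|\,\bar\rho\,\theta^h) - \frac{1}{4h}\,d(\bar\rho,\rho^h)^2 + \tfrac12 E(\rho^h) - \tfrac12 E(\bar\rho) - \beta_1 h \\
&\ge \tfrac12 S(\rho) - \tfrac12 S(\bar\rho) + \tfrac12 E(\rho) - \tfrac12 E(\bar\rho),
\end{align*}

where the last inequality follows from Theorem 3 and the (narrow) continuity of $\rho \mapsto E(\rho)$.

To construct a recovery sequence, we fix a $\rho \in A_\delta$ and take a recovery sequence $\rho^h \rightharpoonup \rho$ in $A_\delta$ as provided by Theorem 3. Then similarly:

\begin{align*}
\limsup_{h\to0}\ J^h_{FP}(\rho^h;\bar\rho) - \frac{1}{4h}\,d(\bar\rho,\rho^h)^2
&\overset{(22)}{=} \limsup_{h\to0}\ \inf_{q\in\Gamma(\bar\rho,\rho^h)} \mathcal{H}(q\,|\,\bar\rho\,\eta^h) - \frac{1}{4h}\,d(\bar\rho,\rho^h)^2 \\
&\overset{(26)}{\le} \limsup_{h\to0}\ \inf_{q\in\Gamma(\bar\rho,\rho^h)} \mathcal{H}(q\,|\,\bar\rho\,\theta^h) - \frac{1}{4h}\,d(\bar\rho,\rho^h)^2 + \tfrac12 E(\rho^h) - \tfrac12 E(\bar\rho) - \beta_0 h \\
&\le \tfrac12 S(\rho) - \tfrac12 S(\bar\rho) + \tfrac12 E(\rho) - \tfrac12 E(\bar\rho).
\end{align*}

□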



4 Diffusion with drift and decay 

In this section we discuss the case of diffusion with decay. For brevity, we first consider 
the case without drift ($\Psi = 0$, $\lambda > 0$). First we describe the particle system that we use
as a microscopic model for this equation, and calculate the corresponding large-deviation 
principle. We proceed with the main results for this equation: Gamma-convergence to an 
energy-dissipation functional, and convergence of the approximation scheme to the solution 
of the diffusion-decay equation. Finally, we discuss how the system can be generalised to 
include drift, and how the decay can be generalised to diffusion-reaction equations.

4.1 Microscopic model 

In contrast to the case without decay, the diffusion-decay equation (18) is not mass-conserving, implying that the Wasserstein distance between two time instances of a solution is not defined. To overcome this difficulty, we assume that all decayed matter continues to exist after its decay, but in a different form. We thus distinguish between normal, non-decayed matter, denoted by $N$, and decayed or dark matter, denoted by $D$.

The microscopic model now consists of a finite number $n$ of independent non-interacting point particles moving in $\mathbb{R}\times\{N,D\}$. Similarly to the non-decaying model, we fix an initial distribution $\bar\rho \in \mathcal{P}(\mathbb{R}\times\{N,D\})$ and initial positions $x_{i,n} \in \mathbb{R}$ and states $\mu_{i,n} \in \{N,D\}$ such that:

\[ \frac1n \sum_{i=1}^n \mathbf{1}_{\{\mu_{i,n}=N\}}\,\delta_{x_{i,n}} \rightharpoonup \bar\rho_N \qquad \text{and} \qquad \frac1n \sum_{i=1}^n \mathbf{1}_{\{\mu_{i,n}=D\}}\,\delta_{x_{i,n}} \rightharpoonup \bar\rho_D \qquad \text{as } n \to \infty. \]

For the dynamics of the system we assume that the motion of all particles in $\mathbb{R}$ is independent of their motion in $\{N,D\}$. We take the motion in $\mathbb{R}$ during some fixed time step $h > 0$ to be Brownian, i.e. governed by the transition probability $\theta^h$ from (14). For the motion in $\{N,D\}$, we assume that the time after which a particle changes from $N$ to $D$ is exponentially distributed with rate $\lambda$. Since decay is a one-way street, the probability for a particle to change back from $D$ to $N$ is zero. This results in a probability for a particle to change from state $\mu$ to $\nu$ during the time step $h$ of

\[ r^h_{\mu\nu} := \begin{cases} e^{-\lambda h}, & \mu = N,\ \nu = N,\\ 1 - e^{-\lambda h}, & \mu = N,\ \nu = D,\\ 0, & \mu = D,\ \nu = N,\\ 1, & \mu = D,\ \nu = D. \end{cases} \]

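Denote $L^h_n := n^{-1}\sum_{i=1}^n \delta_{(Y^h_{i,n},\,\nu^h_{i,n})}$, where $Y^h_{i,n} \in \mathbb{R}$ and $\nu^h_{i,n} \in \{N,D\}$ are the random position and state of the $i$-th particle at time $h$. Indeed, $L^h_n$ converges to the solution at time $h$ of the system

\[ \begin{cases} \partial_t u_N = \partial_{yy} u_N - \lambda u_N, & \text{in } \mathbb{R}\times(0,\infty),\\ \partial_t u_D = \partial_{yy} u_D + \lambda u_N, & \text{in } \mathbb{R}\times(0,\infty), \end{cases} \tag{27} \]

with initial condition $(\bar\rho_N,\bar\rho_D)$. In this sense, the thus defined particle system is a microscopic interpretation of the diffusion-decay equation (18) (if we ignore the dark matter).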



4.2 Large deviations to gradient flow to PDE 

While the inspiration for this paper was equation (1), the construction above suggests to consider not only (1) but also the augmented system of equations (27) (and its extensions to non-zero $\Psi$). For this reason we derive a large-deviation principle and a corresponding energy-dissipation functional for this system, and afterwards simplify by contraction, leading to results for (1).

Let $M^h_n := n^{-1}\sum_{i=1}^n \delta_{(x_{i,n},\,\mu_{i,n},\,Y^h_{i,n},\,\nu^h_{i,n})}$ be the empirical measure of the initial and final configurations corresponding to the particle system defined above. Then (see Theorem 12) the sequence $M^h_n$ satisfies a large-deviation principle in $\mathcal{P}(\mathbb{R}\times\{N,D\}\times\mathbb{R}\times\{N,D\})$ with rate $n$ and rate functional

\[ q \mapsto \begin{cases} \displaystyle\sum_{\substack{\mu=N,D\\ \nu=N,D}} \mathcal{H}\big(q_{\mu\nu}\,\big|\,\bar\rho_\mu\, r^h_{\mu\nu}\,\theta^h\big), & \text{if } q(\,\cdot\times\{N\}\times\mathbb{R}\times\{N,D\}) = \bar\rho_N(\,\cdot\,) \text{ and } q(\,\cdot\times\{D\}\times\mathbb{R}\times\{N,D\}) = \bar\rho_D(\,\cdot\,),\\[1ex] \infty, & \text{otherwise,} \end{cases} \]

writing $q_{\mu\nu}(dx\,dy) = q(dx\times\{\mu\}\times dy\times\{\nu\})$. We note that definitions (6) and (23) indeed allow for non-negative Borel measures that are not necessarily probability measures.

In contrast to the previous case without decay, the special structure of the decay forces us to keep track of more information: not only of the total amount of dark matter, but of both the pre-existing dark matter and the normal matter that is converted to dark matter in the present time step, separately. We thus obtain a large-deviation principle for the triple empirical measures $n^{-1}\sum_{i=1}^n \delta_{(Y^h_{i,n},\,\mu_{i,n}\nu^h_{i,n})}$ with rate $n$ and rate functional (the subscript DfDc stands for 'Diffusion equation with Decay')



JdjdApnn, Pnd, Pdd', Pn, Pd) 



inf 



E 

[iu=NN ,ND ,DD 



inf U{q, v \-p^ v e h ) : 



Pnn, Pnd £ M + (R) such that p NN + p ND = p N j. (28) 

Here $\rho_{\mu\nu}$ is the final-time matter of type $\nu$ that was initially of type $\mu$, and similarly $\bar\rho_{\mu\nu}$ is that part of the initial distribution $\bar\rho_\mu$ that will become of type $\nu$ at time $h$ (see Figure 1). Observe that the term $\mathcal{H}(q_{DN}\,|\,0)$ is zero if and only if $q_{DN} = 0$ a.e., and $\infty$ otherwise; indeed no mass is allowed to change from $D$ to $N$. Hence we omit the dependency on $\rho_{DN}$.



Figure 1: Notation for the various measures in the diffusion-decay equation. The measures $q_{\mu\nu}$ are pair (coupled) measures, with first and second marginals indicated to the left and right of the arrows. The various marginals $\bar\rho_{\mu\nu}$ and $\rho_{\mu\nu}$ combine as indicated to form the observed normal ($\bar\rho_N$ and $\rho_N$) and dark matter ($\bar\rho_D$ and $\rho_D$) at the initial and final times.



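Theorem 6 below shows that for small $h$ we have $J^h_{DfDc} \sim K^h_{DfDc}$, where

\begin{align*}
K^h_{DfDc}(\rho_{NN},\rho_{ND},\rho_{DD};\bar\rho_N,\bar\rho_D) :={}& -\tfrac12 S(\rho_{NN}+\rho_{ND}) - \tfrac12 S(\bar\rho_N) + \frac{1}{4h}\,d(\bar\rho_N,\rho_{NN}+\rho_{ND})^2 \\
& + \tfrac12 S(\rho_{DD}) - \tfrac12 S(\bar\rho_D) + \frac{1}{4h}\,d(\bar\rho_D,\rho_{DD})^2 \\
& + S(\rho_{NN}) + S(\rho_{ND}) - |\rho_{NN}|\log r^h_{NN} - |\rho_{ND}|\log r^h_{ND}. \tag{29}
\end{align*}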
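Let the admissible sets be:

\[ B^0_\delta := \big\{ (\bar\rho_N,\bar\rho_D) \in \big(\mathcal{M}^+(\mathbb{R}) \cap C([0,L])\big)^2 : \bar\rho_N + \bar\rho_D \in A_\delta \big\}; \]

\[ B_\delta(\bar\rho_N,\bar\rho_D) := \Big\{ (\rho_{NN},\rho_{ND},\rho_{DD}) \in \mathcal{M}^+(\mathbb{R})^3 : \tfrac{1}{|\bar\rho_N|}(\rho_{NN}+\rho_{ND}) \in A_\delta \text{ and } \tfrac{1}{|\bar\rho_D|}\rho_{DD} \in A_\delta \Big\}, \]

both equipped with the product narrow topology. We remark that $A_\delta$ contains only densities of mass 1, so that $(\rho_{NN},\rho_{ND},\rho_{DD}) \in B_\delta(\bar\rho_N,\bar\rho_D)$ implies that $|\bar\rho_N| = |\rho_{NN}+\rho_{ND}|$ and $|\bar\rho_D| = |\rho_{DD}|$.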

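Theorem 6. Fix $L > 0$ and choose $0 < \delta < 1$ by Theorem 3. Then for all $(\bar\rho_N,\bar\rho_D) \in B^0_\delta$,

\begin{align*}
& J^h_{DfDc}(\,\cdot_{NN},\cdot_{ND},\cdot_{DD};\bar\rho_N,\bar\rho_D) - \frac{1}{4h}\,d(\bar\rho_N,\cdot_{NN}+\cdot_{ND})^2 - \frac{1}{4h}\,d(\bar\rho_D,\cdot_{DD})^2 + |\cdot_{ND}|\log r^h_{ND} + |\cdot_{NN}|\log r^h_{NN} \\
& \qquad \xrightarrow[h\to0]{\Gamma}\ -\tfrac12 S(\,\cdot_{NN}+\cdot_{ND}) - \tfrac12 S(\bar\rho_N) + \tfrac12 S(\,\cdot_{DD}) - \tfrac12 S(\bar\rho_D) + S(\,\cdot_{NN}) + S(\,\cdot_{ND}) \tag{30}
\end{align*}

in $B_\delta(\bar\rho_N,\bar\rho_D)$.

Note that we have not only subtracted three singular terms from $J^h_{DfDc}$, analogously to Theorems 3 and 5, but also the $h$-order term $-|\cdot_{NN}|\log r^h_{NN}$; the latter is for reasons of symmetry and to simplify calculations.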

Finally, we show that the functional $K^h_{DfDc}$ in (29) indeed defines a variational formulation of the diffusion-decay equation (18). In view of completeness, and of generalisations to diffusion-reaction equations that we will discuss in Section 4.5, we prove convergence of the full scheme, including the dark matter, to the system of equations (27). We then derive the corresponding result for the single diffusion-decay equation (18) by minimising over the dark matter (see Remark 8 below), a procedure essentially the same as the contraction principle (Section 2.2). Because we keep track of the dark matter, the matter that decays in a time step should be added to the dark matter already present from the previous iteration.

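Theorem 7. Let $\rho^0 \in \mathcal{P}_2^a(\mathbb{R})$ and define the sequence $\{(\rho_N^{h,k},\rho_D^{h,k})\}_{k\ge 0}$ by:

\[
(\rho_N^{h,0},\rho_D^{h,0}) = (\rho^0, 0),
\]

and for $k \ge 1$:

\[
(\rho_{NN}^{h,k},\rho_{ND}^{h,k},\rho_{DD}^{h,k}) \in \operatorname*{arg\,min}_{\rho_{NN}+\rho_{ND}+\rho_{DD} \in \mathcal{P}_2^a(\mathbb{R})} K^h_{DfDc}(\rho_{NN},\rho_{ND},\rho_{DD};\rho_N^{h,k-1},\rho_D^{h,k-1}), \tag{31a}
\]

\[
(\rho_N^{h,k},\rho_D^{h,k}) = (\rho_{NN}^{h,k},\, \rho_{ND}^{h,k}+\rho_{DD}^{h,k}). \tag{31b}
\]

These minimisers exist uniquely, and for all $t > 0$, when $h \to 0$ the pair $(\rho_N^{h,\lfloor t/h\rfloor},\rho_D^{h,\lfloor t/h\rfloor})$ converges weakly in $L^1(\mathbb{R}) \times L^1(\mathbb{R})$ to the solution of (27) with initial condition $(\rho^0, 0)$.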
The proof of this theorem is based on [JKO98], and can easily be extended to $\mathbb{R}^d$, and to an additional drift term $\Psi$ (see Section 4.5). Note that when we let $\lambda \to 0$ then $|\rho_{ND}|$ should vanish in (29) to prevent blow-up; indeed, in that case

\[
K^h_{DfDc}(\rho_{NN}, 0, \rho_{DD};\rho_N^{h,k-1},\rho_D^{h,k-1}) = K^h_{Df}(\rho_{NN};\rho_N^{h,k-1}) + K^h_{Df}(\rho_{DD};\rho_D^{h,k-1}).
\]

Remark 8. A further contraction can be used to ignore the dark matter. We can then 
ignore the initial dark matter as well, so that the sequence - ■ ,a - N b~v h satisfies a 

n — ■ i,n i,n 

large-deviation principle with rate n and rate functional 

Pn^ inf inf "H(gMv|p~MV?MV^)- 

USPnn<Pn q&F(p NN ,PN) 
\Pnn\ = \Pn\ 

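The corresponding energy-dissipation functional is then:

\[
\begin{aligned}
K^h_{DfDc}(\rho_N;\bar\rho_N) := \inf_{\rho_{ND}\,:\,|\rho_N+\rho_{ND}| = |\bar\rho_N|} \; & -\tfrac12 S(\rho_N+\rho_{ND}) - \tfrac12 S(\bar\rho_N) + \tfrac{1}{4h}\, d(\bar\rho_N,\rho_N+\rho_{ND})^2 \\
& + S(\rho_N) + S(\rho_{ND}) - |\rho_N| \log r^h_{NN} - |\rho_{ND}| \log r^h_{ND},
\end{aligned}
\tag{32}
\]

which matches the minimisation problem (12). The corresponding version of Theorem 7 is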
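Theorem 9. Let $\rho^0 \in \mathcal{P}_2^a(\mathbb{R})$ and define the sequence $\{\rho_N^{h,k}\}_{k\ge 0}$ by $\rho_N^{h,0} = \rho^0$ and for $k \ge 1$

\[
\rho_N^{h,k} \in \operatorname*{arg\,min}_{\rho \in \mathcal{M}_+(\mathbb{R})} K^h_{DfDc}(\rho;\rho_N^{h,k-1}).
\]

These minimisers exist uniquely, and for all $t > 0$, when $h \to 0$ the function $\rho_N^{h,\lfloor t/h\rfloor}$ converges weakly in $L^1(\mathbb{R})$ to the solution of (18) with initial condition $\rho^0$.

□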

Remark 10. If we restrict ourselves to measures of mass $|\rho_N| = r^h_{NN}|\bar\rho_N|$, thus excluding the possible fluctuation in the decay process, then (32) further reduces to

\[
\rho_N \mapsto \tfrac12 S\bigl(\tfrac{1}{r^h_{NN}}\rho_N\bigr) - \tfrac12 S(\bar\rho_N) + \tfrac{1}{4h}\, d\bigl(\bar\rho_N, \tfrac{1}{r^h_{NN}}\rho_N\bigr)^2.
\]

A similar scheme to deal with decaying mass can be found in [KW99]. □
4.3 Proof of Theorem 6

To reduce clutter we abbreviate $\rho_{NT} := \rho_{NN} + \rho_{ND}$ and $q_{NT} := q_{NN} + q_{ND}$. The sum over $\nu = N, D$ in $J^h_{DfDc}$ can be rewritten as:



inf 

Pnn+Pnd=Pn 

v=N,D 



inf 

Pnn+Pnd = Pn 



_ jrf . ^Wa^*) 



inf 

qNv&(~p Nv ,p Nv ) 



, dqNT dpNu 
log — 



dq 



Nv 



Nv 



dp N 9 h dpNT r 
inf Ti(qNr\PN0 h ) + S(p NN ) + S(p ND ) - >S(Pat) 

cjjvrGr(p iv ,pjvT) 



dpNT 



dq 



QNu 



NT 



- \p NN \ logr 



h 

NN 



\p ND \ logr ND + _ inf _ inf 

PNN+PND = PN <lNN+qND=qNT 

gjvjvGr(p Jviv ,p J vjv) K 



It ^ 



dp 



A/V 



dp 



-Qnt 



NT 



(33) 



We now show that the last sum vanishes under the infima. Since \qx v 
we can apply Gibbs' inequality for v = N, D: 

dpNu 



\PNi 



U \ q Nl 



dp 



NT 



Qnt > 0. 



;Qnt\ 



(34) 



On the other hand, for any given qwr, the measures 

dpNN 



QNN ■ , 

dpNT 

and their first marginals Pnn(') 
the infima. It follows that 



inf _ inf 

PNN+PND = PN QNN+qND=qNT 

qNN&F(PNN'PNN) V 



QNT-, 

= q NN (- x R) and p ND 
dpNu 



Qnd 



dp 



ND 



dpNT 

Qnd{- x 



Qnt 



are admissible in 



n ( qNl 

M \ 



< ~t~~Qnt \ < (qm 

d PNT J V V 



dp 



ATi/ 



dp 



AT 



0. 



(35) 



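Hence we can write:

\[
\begin{aligned}
J^h_{DfDc}(\rho_{NN},\rho_{ND},\rho_{DD};\bar\rho_N,\bar\rho_D) = {}& \inf_{q_{NT}\in\Gamma(\bar\rho_N,\rho_{NT})} \mathcal{H}(q_{NT}\,|\,\bar\rho_N\theta^h) + \inf_{q_{DD}\in\Gamma(\bar\rho_D,\rho_{DD})} \mathcal{H}(q_{DD}\,|\,\bar\rho_D\theta^h) \\
& + S(\rho_{NN}) + S(\rho_{ND}) - S(\rho_{NT}) - |\rho_{NN}| \log r^h_{NN} - |\rho_{ND}| \log r^h_{ND}.
\end{aligned}
\tag{36}
\]

Take $\delta > 0$ from Theorem 3 and fix $(\bar\rho_N,\bar\rho_D) \in B^0_\delta$. We first prove the lower bound of the Gamma-convergence, and then the existence of a recovery sequence.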

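Lower Bound. Take any convergent sequence $(\rho^h_{NN},\rho^h_{ND},\rho^h_{DD}) \rightharpoonup (\rho_{NN},\rho_{ND},\rho_{DD})$ in $B_\delta(\bar\rho_N,\bar\rho_D)$. Again, we write $\rho^h_{NT} = \rho^h_{NN} + \rho^h_{ND}$. Combining (28), (30), and (36), we need to prove that:

\[
\begin{aligned}
\liminf_{h\to 0} \; & \inf_{q_{NT}\in\Gamma(\bar\rho_N,\rho^h_{NT})} \mathcal{H}(q_{NT}\,|\,\bar\rho_N\theta^h) - \tfrac{1}{4h}\, d(\bar\rho_N,\rho^h_{NT})^2 \\
& + \inf_{q_{DD}\in\Gamma(\bar\rho_D,\rho^h_{DD})} \mathcal{H}(q_{DD}\,|\,\bar\rho_D\theta^h) - \tfrac{1}{4h}\, d(\bar\rho_D,\rho^h_{DD})^2 + S(\rho^h_{NN}) + S(\rho^h_{ND}) - S(\rho^h_{NT}) \\
\ge {} & -\tfrac12 S(\rho_{NT}) - \tfrac12 S(\bar\rho_N) + \tfrac12 S(\rho_{DD}) - \tfrac12 S(\bar\rho_D) + S(\rho_{NN}) + S(\rho_{ND}).
\end{aligned}
\tag{37}
\]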
We will prove the lower bound for a number of terms separately.

• By assumption, $|\bar\rho_N|^{-1}\rho^h_{NT}$ lies in $A_\delta$. Although Theorem 3 applies to probability measures, it can easily be extended to measures of different mass, so that:

\[
\liminf_{h\to 0} \inf_{q_{NT}\in\Gamma(\bar\rho_N,\rho^h_{NT})} \mathcal{H}(q_{NT}\,|\,\bar\rho_N\theta^h) - \tfrac{1}{4h}\, d(\bar\rho_N,\rho^h_{NT})^2 \ge \tfrac12 S(\rho_{NT}) - \tfrac12 S(\bar\rho_N). \tag{38}
\]

Similarly, $|\bar\rho_D|^{-1}\rho^h_{DD} \in A_\delta$ and so:

\[
\liminf_{h\to 0} \inf_{q_{DD}\in\Gamma(\bar\rho_D,\rho^h_{DD})} \mathcal{H}(q_{DD}\,|\,\bar\rho_D\theta^h) - \tfrac{1}{4h}\, d(\bar\rho_D,\rho^h_{DD})^2 \ge \tfrac12 S(\rho_{DD}) - \tfrac12 S(\bar\rho_D). \tag{39}
\]

• Since the function $(x,y) \mapsto x\log x + y\log y - (x+y)\log(x+y)$ is convex, the functional

\[
F : (\rho_{NN},\rho_{ND}) \mapsto S(\rho_{NN}) + S(\rho_{ND}) - S(\rho_{NN}+\rho_{ND})
\]

is also convex. Moreover, $F$ is continuous in the set $\{(\rho_{NN},\rho_{ND}) : |\bar\rho_N|^{-1}(\rho_{NN}+\rho_{ND}) \in A_\delta\}$ with the (strong) $L^1$-topology, since any sequence is uniformly bounded. Thus in this set $F$ is also narrowly lower semicontinuous [Bre87, Corollary III.8], i.e.:

\[
\liminf_{h\to 0} S(\rho^h_{NN}) + S(\rho^h_{ND}) - S(\rho^h_{NT}) \ge S(\rho_{NN}) + S(\rho_{ND}) - S(\rho_{NT}). \tag{40}
\]

The required lower bound (37) then follows from (38), (39) and (40).
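The convexity of the integrand used in the second bullet can be sanity-checked numerically. The following is only an illustrative sketch: the names `f` and `max_midpoint_violation`, the sampling box and the seed are assumptions made for this example and are not part of the paper. It samples random pairs of points and records the largest violation of midpoint convexity, which should not be positive beyond rounding error.

```python
import math
import random


def f(x, y):
    """Integrand of F: x log x + y log y - (x + y) log(x + y), for x, y > 0."""
    return x * math.log(x) + y * math.log(y) - (x + y) * math.log(x + y)


def max_midpoint_violation(trials=10_000, seed=0):
    """Largest value of f(midpoint) - (f(a) + f(b)) / 2 over random pairs a, b.

    For a convex function this is never positive, up to rounding error.
    """
    rng = random.Random(seed)
    worst = -math.inf
    for _ in range(trials):
        a = (rng.uniform(1e-3, 5.0), rng.uniform(1e-3, 5.0))
        b = (rng.uniform(1e-3, 5.0), rng.uniform(1e-3, 5.0))
        mid = f((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
        avg = (f(*a) + f(*b)) / 2
        worst = max(worst, mid - avg)
    return worst
```

Such a check is of course no proof; the convexity itself follows from the Hessian of the integrand, which is positive semidefinite with zero determinant.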

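Recovery Sequence. Fix $(\rho_{NN},\rho_{ND},\rho_{DD}) \in B_\delta(\bar\rho_N,\bar\rho_D)$ and take a recovery sequence $\rho^h_{DD} \rightharpoonup \rho_{DD}$ from Theorem 3 such that

\[
\limsup_{h\to 0} \inf_{q_{DD}\in\Gamma(\bar\rho_D,\rho^h_{DD})} \mathcal{H}(q_{DD}\,|\,\bar\rho_D\theta^h) - \tfrac{1}{4h}\, d(\bar\rho_D,\rho^h_{DD})^2 = \tfrac12 S(\rho_{DD}) - \tfrac12 S(\bar\rho_D). \tag{41}
\]

Analogously, take a recovery sequence $\rho^h_{NT} \rightharpoonup \rho_{NN} + \rho_{ND}$ for

\[
\limsup_{h\to 0} \inf_{q_{NT}\in\Gamma(\bar\rho_N,\rho^h_{NT})} \mathcal{H}(q_{NT}\,|\,\bar\rho_N\theta^h) - \tfrac{1}{4h}\, d(\bar\rho_N,\rho^h_{NT})^2 = \tfrac12 S(\rho_{NN}+\rho_{ND}) - \tfrac12 S(\bar\rho_N). \tag{42}
\]

Contrary to the case of the lower bound we define $\rho^h_{NN}$ and $\rho^h_{ND}$ in terms of $\rho^h_{NT}$:

\[
\rho^h_{NN}(dy) := \frac{d\rho_{NN}}{d(\rho_{NN}+\rho_{ND})}(y)\, \rho^h_{NT}(dy), \qquad \rho^h_{ND}(dy) := \frac{d\rho_{ND}}{d(\rho_{NN}+\rho_{ND})}(y)\, \rho^h_{NT}(dy).
\]

If we define the Radon-Nikodym derivatives to be 1 on null sets of $\rho_{NN}+\rho_{ND}$, then clearly $\rho^h_{NN} + \rho^h_{ND} = \rho^h_{NT}$ and $(\rho^h_{NN},\rho^h_{ND},\rho^h_{DD}) \rightharpoonup (\rho_{NN},\rho_{ND},\rho_{DD})$. In fact, this convergence is strong in $L^1(0,L)$ with uniform bounds [ADPZ10, Lem. 6.4]. From this we infer:

\[
S(\rho^h_{NN}) \to S(\rho_{NN}), \quad S(\rho^h_{ND}) \to S(\rho_{ND}) \quad\text{and}\quad S(\rho^h_{NT}) \to S(\rho_{NT}). \tag{43}
\]

Then it follows from (41), (42), and (43) that $(\rho^h_{NN},\rho^h_{ND},\rho^h_{DD})$ is a recovery sequence, i.e.

\[
\begin{aligned}
\limsup_{h\to 0} \; & \inf_{q_{NT}\in\Gamma(\bar\rho_N,\rho^h_{NT})} \mathcal{H}(q_{NT}\,|\,\bar\rho_N\theta^h) - \tfrac{1}{4h}\, d(\bar\rho_N,\rho^h_{NT})^2 \\
& + \inf_{q_{DD}\in\Gamma(\bar\rho_D,\rho^h_{DD})} \mathcal{H}(q_{DD}\,|\,\bar\rho_D\theta^h) - \tfrac{1}{4h}\, d(\bar\rho_D,\rho^h_{DD})^2 + S(\rho^h_{NN}) + S(\rho^h_{ND}) - S(\rho^h_{NT}) \\
\le {} & -\tfrac12 S(\rho_{NN}+\rho_{ND}) - \tfrac12 S(\bar\rho_N) + \tfrac12 S(\rho_{DD}) - \tfrac12 S(\bar\rho_D) + S(\rho_{NN}) + S(\rho_{ND}).
\end{aligned}
\tag{44}
\]

This concludes the proof of Theorem 6.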
4.4 Proof of Theorem 7

Theorem 7 contains two main results: existence and uniqueness of minimisers, and the convergence of time-discrete solutions. We first discuss the existence and uniqueness of minimisers. By slightly rewriting (31a) we can minimise, for fixed $\rho_N^{h,k-1} + \rho_D^{h,k-1} \in \mathcal{P}_2^a(\mathbb{R})$, the functional

\[
\begin{aligned}
(\rho_{NN},\rho_{NT},\rho_{DD}) \mapsto {}& K^h_{DfDc}(\rho_{NN},\rho_{NT}-\rho_{NN},\rho_{DD};\rho_N^{h,k-1},\rho_D^{h,k-1}) \\
= {}& -\tfrac12 S(\rho_{NT}) - \tfrac12 S(\rho_N^{h,k-1}) + \tfrac{1}{4h}\, d(\rho_N^{h,k-1},\rho_{NT})^2 \\
& + \tfrac12 S(\rho_{DD}) - \tfrac12 S(\rho_D^{h,k-1}) + \tfrac{1}{4h}\, d(\rho_D^{h,k-1},\rho_{DD})^2 \\
& + S(\rho_{NN}) + S(\rho_{NT}-\rho_{NN}) - |\rho_{NN}| \log r^h_{NN} - |\rho_{NT}-\rho_{NN}| \log r^h_{ND}.
\end{aligned}
\tag{45}
\]

The negative sign of the term $-\tfrac12 S(\rho_{NT})$ makes this minimisation problem slightly non-trivial. We therefore proceed in steps. For fixed $\rho_{NT}$, the functional

\[
F^h(\rho_{NN}) := S(\rho_{NN}) + S(\rho_{NT}-\rho_{NN}) - |\rho_{NN}| \log r^h_{NN} - |\rho_{NT}-\rho_{NN}| \log r^h_{ND}
\]

is convex and has a unique stationary point that satisfies

\[
0 = \log\rho_{NN} - \log(\rho_{NT}-\rho_{NN}) - \log r^h_{NN} + \log r^h_{ND},
\]

implying that $\rho_{NN} := r^h_{NN}\rho_{NT}$ is the unique global minimiser of $F^h$. Therefore, at every step $k$, we have (see the figure)

\[
\rho_N^{h,k} = \rho_{NN}^{h,k} = r^h_{NN}\,\rho_{NT}^{h,k} \quad\text{and}\quad \rho_{ND}^{h,k} = r^h_{ND}\,\rho_{NT}^{h,k}. \tag{46}
\]
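The pointwise minimisation behind (46) can be illustrated numerically. The sketch below is a hedged example and not the paper's method: it fixes one value $m$ of the density $\rho_{NT}$ and a survival probability $r$ standing in for $r^h_{NN}$ (with $r^h_{ND} = 1 - r$), and searches a grid for the value $x$ of $\rho_{NN}$ that minimises the integrand of $F^h$. The function names and the grid resolution are assumptions made for illustration.

```python
import math


def decay_integrand(x, m, r):
    """Integrand of F^h at a point where rho_NT = m and rho_NN = x.

    Here r plays the role of r_NN and 1 - r the role of r_ND.
    """
    return (x * math.log(x) + (m - x) * math.log(m - x)
            - x * math.log(r) - (m - x) * math.log(1.0 - r))


def grid_minimiser(m, r, points=100_001):
    """Brute-force minimiser of x -> decay_integrand(x, m, r) on a grid in (0, m)."""
    grid = [m * i / points for i in range(1, points)]
    return min(grid, key=lambda x: decay_integrand(x, m, r))
```

For example, `grid_minimiser(2.0, 0.9)` should land close to `0.9 * 2.0`, in line with $\rho_{NN} = r^h_{NN}\rho_{NT}$; at that point the integrand equals $m \log m$, which is the pointwise form of the identity used later to pass from (45) to (47).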
The problem of minimising (45) can now be reduced to the minimisation of

\[
\begin{aligned}
(\rho_{NT},\rho_{DD}) \mapsto {}& K^h_{DfDc}(r^h_{NN}\rho_{NT},\, r^h_{ND}\rho_{NT},\, \rho_{DD};\rho_N^{h,k-1},\rho_D^{h,k-1}) \\
= {}& \tfrac12 S(\rho_{NT}) - \tfrac12 S(\rho_N^{h,k-1}) + \tfrac{1}{4h}\, d(\rho_N^{h,k-1},\rho_{NT})^2 \\
& + \tfrac12 S(\rho_{DD}) - \tfrac12 S(\rho_D^{h,k-1}) + \tfrac{1}{4h}\, d(\rho_D^{h,k-1},\rho_{DD})^2,
\end{aligned}
\tag{47}
\]

which consists of two decoupled minimisation problems, for which existence and uniqueness of minimisers are proved in [JKO98, Prop. 4.1].

The compactness of the sequence $(\rho_N^{h,\lfloor t/h\rfloor},\rho_D^{h,\lfloor t/h\rfloor})$ is based on the same principle as in [JKO98], but with a twist. The central observation is again that $(\rho_N^{h,k-1},\rho_D^{h,k-1})$ is admissible in (47), leading to the estimate

\[
\tfrac{1}{2h}\, d(\rho_N^{h,k-1},\rho_{NT}^{h,k})^2 + \tfrac{1}{2h}\, d(\rho_D^{h,k-1},\rho_{DD}^{h,k})^2 \le -S(\rho_{NT}^{h,k}) + S(\rho_N^{h,k-1}) - S(\rho_{DD}^{h,k}) + S(\rho_D^{h,k-1}). \tag{48}
\]

However, the migration of mass from normal to dark matter means that upon summing 
this estimate over $k$, terms in the right-hand side do not cancel. Below we establish the a priori estimates

\[
M_2(\rho_N^{h,k}+\rho_D^{h,k}) := \int |x|^2\, d(\rho_N^{h,k}+\rho_D^{h,k}) \le C, \tag{49}
\]

\[
\sum_{k=1}^{\lfloor T/h\rfloor} d(\rho_N^{h,k-1},\rho_{NT}^{h,k})^2 + d(\rho_D^{h,k-1},\rho_{DD}^{h,k})^2 \le Ch, \tag{50}
\]

where the constant $C$ only depends on the initial data and on the maximal time $T$. As in [JKO98] these provide the appropriate tightness in space (by (49)) and continuity in time (by (50)) to conclude that there exists a subsequence such that $(\rho_N^{h,\lfloor t/h\rfloor},\rho_D^{h,\lfloor t/h\rfloor}) \rightharpoonup (u_N,u_D)$, weakly in $L^1(\mathbb{R}\times(0,T)) \times L^1(\mathbb{R}\times(0,T))$.

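We now prove (49) and (50). Recall from [JKO98] the estimates

\[
-S(\rho) \le C\,(M_2(\rho) + 1)^\alpha \quad\text{for some } 0 < \alpha < 1 \text{ and for all } \rho \in \mathcal{M}_+(\mathbb{R}), \tag{51}
\]

\[
M_2(\rho_1) \le 2M_2(\rho_0) + 2\, d(\rho_0,\rho_1)^2 \quad\text{for all } \rho_0,\rho_1 \in \mathcal{M}_+(\mathbb{R}) \text{ with } |\rho_0| = |\rho_1|. \tag{52}
\]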

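This allows us to estimate, for $n \in \mathbb{N}$ such that $nh \le T$,

\[
M_2(\rho_N^{h,n}+\rho_D^{h,n}) \le 2M_2(\rho_N^0+\rho_D^0) + 2\, d(\rho_N^{h,n}+\rho_D^{h,n},\, \rho_N^0+\rho_D^0)^2.
\]

The second term above we then estimate by

\[
\begin{aligned}
d(\rho_N^{h,n}+\rho_D^{h,n},\, \rho_N^0+\rho_D^0)^2
&\le \Bigl(\sum_{k=1}^n d(\rho_N^{h,k}+\rho_D^{h,k},\, \rho_N^{h,k-1}+\rho_D^{h,k-1})\Bigr)^2 \\
&\le n \sum_{k=1}^n d(\rho_N^{h,k}+\rho_D^{h,k},\, \rho_N^{h,k-1}+\rho_D^{h,k-1})^2 \\
&\overset{(31b)}{=} n \sum_{k=1}^n d(\rho_{NT}^{h,k}+\rho_{DD}^{h,k},\, \rho_N^{h,k-1}+\rho_D^{h,k-1})^2 \\
&\le n \sum_{k=1}^n d(\rho_{NT}^{h,k},\rho_N^{h,k-1})^2 + d(\rho_{DD}^{h,k},\rho_D^{h,k-1})^2.
\end{aligned}
\]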



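We also observe some properties of $S$:

\[
S(\alpha\rho + \beta\rho) = S(\alpha\rho) + S(\beta\rho) - \alpha|\rho| \log\frac{\alpha}{\alpha+\beta} - \beta|\rho| \log\frac{\beta}{\alpha+\beta} \quad\text{for all } \alpha,\beta > 0 \text{ and } \rho \in \mathcal{M}_+, \tag{53}
\]

and in general

\[
S(\rho_1+\rho_2) \le S(\rho_1) + S(\rho_2) - |\rho_1| \log\frac{|\rho_1|}{|\rho_1+\rho_2|} - |\rho_2| \log\frac{|\rho_2|}{|\rho_1+\rho_2|} \quad\text{for any } \rho_1,\rho_2 \in \mathcal{M}_+. \tag{54}
\]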
The first follows from simple calculation, and the second can be proved by writing $\rho_1+\rho_2 = \lambda(\rho_1/\lambda) + (1-\lambda)(\rho_2/(1-\lambda))$, applying the convexity of $S$, and optimising with respect to $\lambda$. Combining these with (46) we then have

\[
S(\rho_{NT}^{h,k}) = S(\rho_{NN}^{h,k}) + S(\rho_{ND}^{h,k}) - |\rho_{NN}^{h,k}| \log r^h_{NN} - |\rho_{ND}^{h,k}| \log r^h_{ND}, \quad\text{and} \tag{55}
\]

\[
S(\rho_D^{h,k}) \le S(\rho_{ND}^{h,k}) + S(\rho_{DD}^{h,k}) - |\rho_{ND}^{h,k}| \log\frac{|\rho_{ND}^{h,k}|}{|\rho_D^{h,k}|} - |\rho_{DD}^{h,k}| \log\frac{|\rho_{DD}^{h,k}|}{|\rho_D^{h,k}|}. \tag{56}
\]
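The two properties of $S$ used here can be checked on discretised densities. The sketch below is illustrative only: it assumes $S(\rho) = \int \rho\log\rho$ and approximates it by a Riemann sum on a uniform grid with spacing `dx`; all function names are hypothetical. The first function measures the gap in the splitting identity for $S(\alpha\rho+\beta\rho)$, the second the slack in the general inequality for $S(\rho_1+\rho_2)$, which should be non-negative.

```python
import math


def entropy(rho, dx):
    """Riemann-sum approximation of S(rho) = int rho log rho, with 0 log 0 = 0."""
    return sum(v * math.log(v) for v in rho if v > 0.0) * dx


def mass(rho, dx):
    """Riemann-sum approximation of the total mass |rho|."""
    return sum(rho) * dx


def split_identity_gap(rho, dx, alpha, beta):
    """Left-hand side minus right-hand side of the identity for S(alpha*rho + beta*rho)."""
    m = mass(rho, dx)
    lhs = entropy([(alpha + beta) * v for v in rho], dx)
    rhs = (entropy([alpha * v for v in rho], dx)
           + entropy([beta * v for v in rho], dx)
           - alpha * m * math.log(alpha / (alpha + beta))
           - beta * m * math.log(beta / (alpha + beta)))
    return lhs - rhs


def subadditivity_slack(rho1, rho2, dx):
    """Right-hand side minus left-hand side of the inequality for S(rho1 + rho2)."""
    m1, m2 = mass(rho1, dx), mass(rho2, dx)
    lhs = entropy([a + b for a, b in zip(rho1, rho2)], dx)
    rhs = (entropy(rho1, dx) + entropy(rho2, dx)
           - m1 * math.log(m1 / (m1 + m2))
           - m2 * math.log(m2 / (m1 + m2)))
    return rhs - lhs
```

Taking $\alpha = r^h_{NN}$, $\beta = r^h_{ND}$ with $\alpha+\beta = 1$ in the first function corresponds to (55); the second corresponds to (56) with $\rho_1 = \rho_{ND}^{h,k}$ and $\rho_2 = \rho_{DD}^{h,k}$.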



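Now, putting the ingredients together:

\[
\begin{aligned}
M_2(\rho_N^{h,n}+\rho_D^{h,n})
&\overset{(52)}{\le} 2M_2(\rho_N^0+\rho_D^0) + 2\, d(\rho_N^{h,n}+\rho_D^{h,n},\, \rho_N^0+\rho_D^0)^2 \\
&\le C + 2n \sum_{k=1}^n d(\rho_N^{h,k-1},\rho_{NT}^{h,k})^2 + d(\rho_D^{h,k-1},\rho_{DD}^{h,k})^2 \\
&\overset{(48)}{\le} C + 4nh \sum_{k=1}^n S(\rho_N^{h,k-1}) - S(\rho_{NT}^{h,k}) + S(\rho_D^{h,k-1}) - S(\rho_{DD}^{h,k}) \\
&\overset{(55),(56)}{\le} C + 4T \sum_{k=1}^n \Bigl[ S(\rho_N^{h,k-1}) - S(\rho_N^{h,k}) + S(\rho_D^{h,k-1}) - S(\rho_D^{h,k}) \Bigr] \\
&\qquad + 4T \underbrace{\sum_{k=1}^n \Bigl[ |\rho_{NN}^{h,k}| \log r^h_{NN} + |\rho_{ND}^{h,k}| \log r^h_{ND} - |\rho_{ND}^{h,k}| \log\tfrac{|\rho_{ND}^{h,k}|}{|\rho_D^{h,k}|} - |\rho_{DD}^{h,k}| \log\tfrac{|\rho_{DD}^{h,k}|}{|\rho_D^{h,k}|} \Bigr]}_{\le 0 \text{ (see below)}} \\
&\overset{(51)}{\le} C + 4T \Bigl( S(\rho_N^0) + S(\rho_D^0) + C\,(M_2(\rho_N^{h,n})+1)^\alpha + C\,(M_2(\rho_D^{h,n})+1)^\alpha \Bigr) \\
&\le C + 4T \Bigl( S(\rho_N^0) + S(\rho_D^0) + 2^\alpha C\,(M_2(\rho_N^{h,n}+\rho_D^{h,n})+2)^\alpha \Bigr).
\end{aligned}
\]

Therefore $M_2(\rho_N^{h,n}+\rho_D^{h,n})$ is bounded on finite time intervals, which proves (49), and the boundedness of the second line above implies (50).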

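The sign of the brace above can be shown as follows: setting $r := r^h_{NN}$ and therefore $r^h_{ND} = 1 - r$, by (46) we have

\[
|\rho_N^{h,k}| = r^k, \qquad |\rho_D^{h,k}| = 1 - r^k, \qquad |\rho_{ND}^{h,k}| = r^{k-1} - r^k \quad\text{and}\quad |\rho_{DD}^{h,k}| = 1 - r^{k-1}.
\]

Then

\[
\begin{aligned}
&\sum_{k=1}^n |\rho_{NN}^{h,k}| \log r^h_{NN} + |\rho_{ND}^{h,k}| \log r^h_{ND} - |\rho_{ND}^{h,k}| \log\frac{|\rho_{ND}^{h,k}|}{|\rho_D^{h,k}|} - |\rho_{DD}^{h,k}| \log\frac{|\rho_{DD}^{h,k}|}{|\rho_D^{h,k}|} \\
&\quad = \sum_{k=1}^n r^k \log r + (r^{k-1}-r^k) \log(1-r) - (r^{k-1}-r^k) \log\frac{r^{k-1}-r^k}{1-r^k} - (1-r^{k-1}) \log\frac{1-r^{k-1}}{1-r^k} \\
&\quad = \sum_{k=1}^n r^k \log r^k - r^{k-1} \log r^{k-1} + (1-r^k) \log(1-r^k) - (1-r^{k-1}) \log(1-r^{k-1}) \\
&\quad = r^n \log r^n + (1-r^n) \log(1-r^n) \le 0.
\end{aligned}
\tag{57}
\]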
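The telescoping computation in (57) is easy to confirm numerically. The sketch below is an illustration under the mass formulas stated just before (57), with total initial mass 1 and the convention $0\log 0 = 0$; the function names are hypothetical. It evaluates the sum term by term and compares it with the closed form $r^n\log r^n + (1-r^n)\log(1-r^n)$.

```python
import math


def xlogx(x):
    """x log x with the convention 0 log 0 = 0."""
    return 0.0 if x == 0.0 else x * math.log(x)


def brace_sum(r, n):
    """Term-by-term evaluation of the sum under the brace, for survival probability r."""
    total = 0.0
    for k in range(1, n + 1):
        m_nn = r ** k                   # mass of normal matter after step k
        m_nd = r ** (k - 1) - r ** k    # mass that decays during step k
        m_dd = 1.0 - r ** (k - 1)       # dark matter present before step k
        m_d = 1.0 - r ** k              # dark matter after step k
        total += m_nn * math.log(r) + m_nd * math.log(1.0 - r)
        total -= m_nd * math.log(m_nd / m_d)
        if m_dd > 0.0:                  # the k = 1 term vanishes: 0 log 0 = 0
            total -= m_dd * math.log(m_dd / m_d)
    return total


def closed_form(r, n):
    """Right-hand side of the telescoped sum."""
    rn = r ** n
    return xlogx(rn) + xlogx(1.0 - rn)
```

The closed form is the negative of a binary entropy, which makes the sign of the brace evident.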



This concludes the proof of the compactness and therefore the convergence of a subsequence.

We now determine the equation satisfied by the time-discrete minimisers using the method introduced in [JKO98]. After perturbing the minimisers $\rho_{NT}^{h,k}$ and $\rho_{DD}^{h,k}$ by a push-forward, we find that for all $\xi \in C_c^\infty(\mathbb{R})$,

\[
\begin{aligned}
\iint (y-x)\,\xi(y)\, q_{NT}(dx\,dy) - h \int \rho_{NT}^{h,k}(y)\, \partial_y\xi(y)\, dy &= 0, \\
\iint (y-x)\,\xi(y)\, q_{DD}(dx\,dy) - h \int \rho_{DD}^{h,k}(y)\, \partial_y\xi(y)\, dy &= 0,
\end{aligned}
\tag{58}
\]

where $q_{NT}$ and $q_{DD}$ are the optimal transport plans in $d(\rho_N^{h,k-1},\rho_{NT}^{h,k})$ and $d(\rho_D^{h,k-1},\rho_{DD}^{h,k})$.

Using $\rho_N^{h,k} = \rho_{NN}^{h,k} = r^h_{NN}\rho_{NT}^{h,k}$ and $\rho_D^{h,k} = r^h_{ND}\rho_{NT}^{h,k} + \rho_{DD}^{h,k}$ as prescribed by (31b) and (46), we add up the equations above to find for all $\xi$,

\[
\begin{aligned}
\iint (y-x)\,\xi(y)\, r^h_{NN}\, q_{NT}(dx\,dy) - h \int \partial_y\xi(y)\, \rho_N^{h,k}(dy) &= 0, \\
\iint (y-x)\,\xi(y)\, (r^h_{ND}\, q_{NT} + q_{DD})(dx\,dy) - h \int \partial_y\xi(y)\, \rho_D^{h,k}(dy) &= 0.
\end{aligned}
\tag{59}
\]

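As $r^h_{NN} q_{NT} \in \Gamma(r^h_{NN}\rho_N^{h,k-1},\rho_N^{h,k})$ and $r^h_{ND} q_{NT} + q_{DD} \in \Gamma(r^h_{ND}\rho_N^{h,k-1} + \rho_D^{h,k-1},\rho_D^{h,k})$ (although the second may not be optimal), we have the following bounds for any $\zeta \in C_c^\infty(\mathbb{R})$:

\[
\begin{aligned}
&\Bigl| \int (\rho_N^{h,k} - r^h_{NN}\rho_N^{h,k-1})\,\zeta - \iint (y-x)\,\partial_y\zeta(y)\, r^h_{NN}\, q_{NT}(dx\,dy) \Bigr| \\
&\quad = \Bigl| \iint \bigl( \zeta(y) - \zeta(x) + (x-y)\,\partial_y\zeta(y) \bigr)\, r^h_{NN}\, q_{NT}(dx\,dy) \Bigr| \\
&\quad \le \tfrac12 \sup|\partial_{yy}\zeta|\; r^h_{NN} \iint |y-x|^2\, q_{NT}(dx\,dy) \\
&\quad \le \tfrac12 \sup|\partial_{yy}\zeta|\; d(\rho_N^{h,k-1},\rho_{NT}^{h,k})^2,
\end{aligned}
\]

and similarly,

\[
\begin{aligned}
&\Bigl| \int \bigl( \rho_D^{h,k} - (r^h_{ND}\rho_N^{h,k-1} + \rho_D^{h,k-1}) \bigr)\,\zeta - \iint (y-x)\,\partial_y\zeta(y)\, (r^h_{ND}\, q_{NT} + q_{DD})(dx\,dy) \Bigr| \\
&\quad \le \tfrac12 \sup|\partial_{yy}\zeta| \Bigl( d(\rho_N^{h,k-1},\rho_{NT}^{h,k})^2 + d(\rho_D^{h,k-1},\rho_{DD}^{h,k})^2 \Bigr).
\end{aligned}
\]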
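After applying these bounds to the equations (59), taking $\xi = \partial_y\zeta$, we find for all $\zeta$:

\[
\Bigl| \int \Bigl( \tfrac1h (\rho_N^{h,k} - r^h_{NN}\rho_N^{h,k-1})\,\zeta - \rho_N^{h,k}\,\partial_{yy}\zeta \Bigr) dy \Bigr| \le \tfrac{1}{2h} \sup|\partial_{yy}\zeta|\; d(\rho_N^{h,k-1},\rho_{NT}^{h,k})^2,
\]

and

\[
\Bigl| \int \Bigl( \tfrac1h (\rho_D^{h,k} - r^h_{ND}\rho_N^{h,k-1} - \rho_D^{h,k-1})\,\zeta - \rho_D^{h,k}\,\partial_{yy}\zeta \Bigr) dy \Bigr| \le \tfrac{1}{2h} \sup|\partial_{yy}\zeta| \Bigl( d(\rho_N^{h,k-1},\rho_{NT}^{h,k})^2 + d(\rho_D^{h,k-1},\rho_{DD}^{h,k})^2 \Bigr).
\]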

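Using the convergence of a subsequence (not relabeled) $(\rho_N^{h,\lfloor t/h\rfloor},\rho_D^{h,\lfloor t/h\rfloor}) \rightharpoonup (u_N,u_D)$ weakly in $L^1(\mathbb{R}\times(0,T)) \times L^1(\mathbb{R}\times(0,T))$, we find that for all $\zeta \in C_c^\infty(\mathbb{R}\times[0,T])$,

\[
\begin{aligned}
&\Bigl| \int_0^T\!\!\int u_N \Bigl( -\partial_t\zeta + \bigl( \lim_{h\to 0} \tfrac{1-r^h_{NN}}{h} \bigr)\zeta - \partial_{yy}\zeta \Bigr) dy\,dt \Bigr| \\
&\quad = \lim_{h\to 0} \Bigl| \int_0^T\!\!\int \Bigl( \tfrac1h \bigl( \rho_N^{h,\lfloor t/h\rfloor} - \rho_N^{h,\lfloor t/h\rfloor-1} \bigr)\zeta + \tfrac{1-r^h_{NN}}{h}\, \rho_N^{h,\lfloor t/h\rfloor-1}\zeta - \rho_N^{h,\lfloor t/h\rfloor}\,\partial_{yy}\zeta \Bigr) dy\,dt \Bigr| \\
&\quad \le \lim_{h\to 0} \sum_{k=1}^{\lfloor T/h\rfloor} \tfrac12 \sup|\partial_{yy}\zeta|\; d(\rho_N^{h,k-1},\rho_{NT}^{h,k})^2 \;\le\; \lim_{h\to 0} Ch = 0,
\end{aligned}
\]

and for the dark matter

\[
\begin{aligned}
&\Bigl| \int_0^T\!\!\int \Bigl( -u_D\,\partial_t\zeta - \bigl( \lim_{h\to 0} \tfrac{r^h_{ND}}{h} \bigr) u_N\,\zeta - u_D\,\partial_{yy}\zeta \Bigr) dy\,dt \Bigr| \\
&\quad = \lim_{h\to 0} \Bigl| \int_0^T\!\!\int \Bigl( \tfrac1h \bigl( \rho_D^{h,\lfloor t/h\rfloor} - \rho_D^{h,\lfloor t/h\rfloor-1} \bigr)\zeta - \tfrac{r^h_{ND}}{h}\, \rho_N^{h,\lfloor t/h\rfloor-1}\zeta - \rho_D^{h,\lfloor t/h\rfloor}\,\partial_{yy}\zeta \Bigr) dy\,dt \Bigr| \\
&\quad \le \lim_{h\to 0} \sum_{k=1}^{\lfloor T/h\rfloor} \tfrac12 \sup|\partial_{yy}\zeta| \Bigl( d(\rho_N^{h,k-1},\rho_{NT}^{h,k})^2 + d(\rho_D^{h,k-1},\rho_{DD}^{h,k})^2 \Bigr) \;\le\; \lim_{h\to 0} Ch = 0.
\end{aligned}
\]

From this we see that the limit $(u_N,u_D)$ indeed solves (27) (weakly in $L^1(\mathbb{R}\times(0,T))$). This concludes the proof of Theorem 7.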
4.5 Drift with decay and reactions

Diffusion with drift and decay. The results from Sections 3 and 4 can be easily combined in the following way. A microscopic model for the Fokker-Planck equation with decay (1) is obtained by replacing the spatial transition probability $\theta^h$ in the micro model from Section 4.1 by the fundamental solution $\eta^h$ of the Fokker-Planck equation defined earlier. The corresponding large-deviation rate functional then simply becomes (25) with that transition probability. By the same arguments as in Theorems 5 and 6, the large-deviation rate functional is related to the following energy-dissipation functional in a Gamma-convergence sense:

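\[
\begin{aligned}
K^h_{FPDc}(\rho_{NN},\rho_{ND},\rho_{DD};\bar\rho_N,\bar\rho_D) := {}& -\tfrac12 S(\rho_{NN}+\rho_{ND}) - \tfrac12 S(\bar\rho_N) + \tfrac{1}{4h}\, d(\bar\rho_N,\rho_{NN}+\rho_{ND})^2 \\
& + \tfrac12 S(\rho_{DD}) - \tfrac12 S(\bar\rho_D) + \tfrac{1}{4h}\, d(\bar\rho_D,\rho_{DD})^2 \\
& + S(\rho_{NN}) + S(\rho_{ND}) - |\rho_{NN}| \log r^h_{NN} - |\rho_{ND}| \log r^h_{ND} \\
& + \tfrac12 E(\rho_{NN}+\rho_{ND}+\rho_{DD}) - \tfrac12 E(\bar\rho_N+\bar\rho_D).
\end{aligned}
\tag{60}
\]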

Indeed, as our main result this functional defines a variational formulation for the Fokker-Planck equation with decay (1):

Theorem 11. Let $\rho^0 \in \mathcal{P}_2^a(\mathbb{R})$ and define the sequence $\{(\rho_N^{h,k},\rho_D^{h,k})\}_{k\ge 0}$ by:

\[
(\rho_N^{h,0},\rho_D^{h,0}) = (\rho^0, 0),
\]

and for $k \ge 1$:

\[
\begin{aligned}
(\rho_{NN}^{h,k},\rho_{ND}^{h,k},\rho_{DD}^{h,k}) &\in \operatorname*{arg\,min}_{\rho_{NN}+\rho_{ND}+\rho_{DD} \in \mathcal{P}_2^a(\mathbb{R})} K^h_{FPDc}(\rho_{NN},\rho_{ND},\rho_{DD};\rho_N^{h,k-1},\rho_D^{h,k-1}), \\
(\rho_N^{h,k},\rho_D^{h,k}) &= (\rho_{NN}^{h,k},\, \rho_{ND}^{h,k}+\rho_{DD}^{h,k}).
\end{aligned}
\]

These minimisers exist uniquely, and for all $t > 0$, when $h \to 0$ the pair $(\rho_N^{h,\lfloor t/h\rfloor},\rho_D^{h,\lfloor t/h\rfloor})$ converges weakly in $L^1(\mathbb{R})$ to the solution of (27) with initial condition $(\rho^0, 0)$.

The proof is a slight adaptation of the proof of Theorem 7 with the observation that after perturbing with a push-forward, the continuity equations (58) include the additional terms $h\int \rho_{NT}^{h,k}(y)\,\xi(y)\,\partial_y\Psi(y)\,dy$ and $h\int \rho_{DD}^{h,k}(y)\,\xi(y)\,\partial_y\Psi(y)\,dy$ for the potential energy. Following the proof of Theorem 7 these extra terms will result in the convection term in equation (1).



Diffusion-reaction equations. Another useful generalisation is a system of equations 
that describe the transition between a set of states v in some index set J: 

d t U V = dyyUy ~ ^ (62) 
We should then choose the transition probabilities $r^h_{\mu\nu}$ of the microscopic system in such

a way that lim^o = and = 1 — z2 u ^^ r ilu- The large- deviation rate functional 
corresponding to this micro model is: 

/iGi — _- v€l 

which Gamma-converges, after subtracting singular terms, to the functional: 

Yl [~ ^ S (^ei Pn») ~ I s W + ih d (Pv"T,uei Pv»Y ' + i S M - M lo § r y ]■ ( 63 ) 
i-iei uei 

In the same way as in Theorem 7 this functional defines a variational formulation for the system of diffusion-reaction equations (62).



5 Discussion 



The work of [ADPZ10] uncovered an intriguing link between the diffusion equation, the entropy-Wasserstein gradient-flow formulation of that equation, and a large-deviation principle for a stochastic particle system. The work of the present paper is motivated by the question whether this link can be generalised.

Equation (1) moves beyond [ADPZ10] in two ways. The additional drift term represented by $\Psi$ is compatible with the Wasserstein framework. The corresponding equation (7) is a Wasserstein gradient flow of the free energy functional $S + E$. In Section 3 we show that the large-deviation connection also generalises to this case, with only minor modification. Corresponding continuous-time large-deviation results, for instance in [DG94] or [FK06, Th. 13.37], mirror this.

The case of decay is different. The structure of the time-discretised gradient flow in Theorem 7 has some non-standard features:

• The iteration defined in Theorem 7 is special in that the minimisation is taken over the pair $(\rho_{NN},\rho_{ND})$, and the result is added to the dark matter of the previous time step. Of course, when ignoring the dark matter, as in Remark 8, this is not visible, as is shown in the corresponding definition in Theorem 9.

• The functional $K^h_{DfDc}$ in (29) is not that of a 'standard' gradient flow. The discussion in Section 1.2 and the proof of Theorem 7 suggest splitting it into three parts: two parts that represent the diffusion steps for normal and decayed matter, and a third part for the decay step. The fact that the operator can be split into terms for each driving force is related to the independence of the processes in the micro model, so that the transition probability is a product of two probabilities, which can then be split according to calculation (33). Pursuing the analogy with the diffusion step, and with metric-space gradient flows, one might interpret $S(\rho_{NN}) + S(\rho_{NT}-\rho_{NN}) - S(\rho_{NT})$ as the driving energy behind the decay, by which the dissipation would then become the (linear!) terms $-|\rho_{NN}| \log r^h_{NN} - |\rho_{ND}| \log r^h_{ND}$. In which sense this interpretation is meaningful is as yet unknown.

As for the restrictions in the theorems of this paper, the Gamma-convergence Theorems 5 and 6 are based on [ADPZ10], and are thus subject to the same restrictions (one dimension and a small $L^\infty$-neighbourhood of a constant). Apart from this dependence, we see no reasons why Theorems 5 and 6 should be restricted to the 1-dimensional setting, bounded domains $[0,L]$, and to the $A_\delta$ and $B_\delta$ sets. In fact, these restrictions are not needed in Theorem 7, which can easily be extended to higher dimensions.

The way we have set up the microscopic model in this paper restricts us to decay processes. The reason that we cannot generalise to 'birth' processes (i.e. $\lambda < 0$) is that, in the microscopic model, linear birth rates depend on the amount of existing normal matter. Therefore, in contrast to exponential decay, exponential birth requires a system of particles with interdependence, which prevents the techniques in this paper from being extended to birth processes in a trivial way.

The exact choice of the microscopic transition probabilities may not influence the continuum limit, as the limit only depends on the asymptotic behaviour of the probabilities as $h \to 0$. However, this choice will affect the discrete-time approximation (29). In general, different microscopic systems can lead to different variational formulations for the same equation. For instance, the minimisation functional (63) that we derive for a system of diffusion-reaction equations differs from the $L^2$-gradient flow in [PSV10] for that same equation, as the underlying microscopic model of that paper models reaction as diffusion in a chemical landscape.

One of the interesting suggestions of the connection between large-deviation principles and gradient flows is the possibility that every gradient-flow structure might correspond to a large-deviation principle for some stochastic process. For instance, there is of course a different gradient-flow formulation for the diffusion-decay equation without drift (18), with driving energy



and with the L 2 -metric as dissipation. This can be seen by using the fact that in the 
Hilbert space L 2 a gradient flow satisfies at each time t > 



which can be rewritten as a weak form of (18). Could this structure be related to a large-deviation principle of some stochastic process? At this point we have no idea.

A The quenched large-deviation principle 

In this appendix we derive the large-deviation principles that are used in this paper, in a slightly more general context. First we state the large-deviation principle of the pair



(d t p, s) L 2 



for all s e L 2 
empirical measure. The proof is mainly due to Léonard, but we include it here to provide the full details. In the following, $\Omega$ will denote a (separable metric) Radon space.



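Theorem 12 ([Leo07, Prop. 3.2]). Fix $\rho^0 \in \mathcal{P}(\Omega)$ and let $\{x_{i,n}\}_{i=1,\dots,n;\, n\ge 1} \subset \Omega$ be so that

\[
L^0_n := \frac1n \sum_{i=1}^n \delta_{x_{i,n}} \rightharpoonup \rho^0 \quad\text{as } n \to \infty. \tag{64}
\]

Let $\zeta : \Omega \to \mathcal{P}(\Omega)$ be continuous with respect to the narrow topology of $\mathcal{P}(\Omega)$, and let each random variable $Y_{i,n}$ in $\Omega$ be distributed by $\zeta^{x_{i,n}}$. Define the pair empirical measure $M_n := n^{-1} \sum_{i=1}^n \delta_{(x_{i,n},Y_{i,n})}$. Then the sequence $\{M_n\}_n$ satisfies the large-deviation principle in $\mathcal{P}(\Omega^2)$ with rate $n$ and rate functional:

\[
I(q;\rho^0) := \begin{cases} \mathcal{H}(q\,|\,p), & \text{if } \pi_0 q = \rho^0, \\ \infty, & \text{otherwise,} \end{cases} \tag{65}
\]

with $p(dx\,dy) := \zeta^x(dy)\,\rho^0(dx)$.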

Proof. We write $C_b(\Omega^2)$ for the space of continuous bounded functions on $\Omega^2$, and $C_b(\Omega^2)^*$ and $C_b(\Omega^2)'$ for its topological and algebraic dual respectively, the latter being the space of all linear functionals on $C_b(\Omega^2)$ with the weakest topology that makes all these linear functionals continuous. We equip both $C_b(\Omega^2)^*$ and $C_b(\Omega^2)'$ with the topology induced by the duality with $C_b(\Omega^2)$, denoted by $\langle\cdot,\cdot\rangle$. Recall that the dual $C_b(\Omega)^*$ can be identified with the space of finite, finitely additive, and regular signed Borel measures [DS57, Th. IV.6.2]. Moreover, since $\Omega^2$ is Radon any probability measure is regular. Hence $\mathcal{P}(\Omega^2) \subset C_b(\Omega^2)^* \subset C_b(\Omega^2)'$, and the topologies on $\mathcal{P}(\Omega^2)$ and $C_b(\Omega^2)^*$ coincide with the induced topology as a subset of $C_b(\Omega^2)'$. Note, however, that $C_b(\Omega^2)^*$ is closed, while $\mathcal{P}(\Omega^2)$ is not.

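We first consider $M_n$ as random variables in $C_b(\Omega^2)'$. For an arbitrary number of functions $\phi_1,\dots,\phi_d$ in $C_b(\Omega^2)$, define the new random variables:

\[
Z_{\phi_1,\dots,\phi_d;n} := \bigl( \langle\phi_1,M_n\rangle, \dots, \langle\phi_d,M_n\rangle \bigr) = \Bigl( \frac1n \sum_{i=1}^n \phi_1(x_{i,n},Y_{i,n}), \dots, \frac1n \sum_{i=1}^n \phi_d(x_{i,n},Y_{i,n}) \Bigr).
\]

First we prove the large-deviation principle of $\mathrm{Law}(Z_{\phi_1,\dots,\phi_d;n})$ in $\mathbb{R}^d$ using the Gärtner-Ellis Theorem. For any $\lambda \in \mathbb{R}^d$:

\[
\begin{aligned}
\Lambda_{\phi_1,\dots,\phi_d;n}(\lambda) :={}& \frac1n \log \mathbb{E} \exp\bigl( n\lambda\cdot Z_{\phi_1,\dots,\phi_d;n} \bigr) \\
={}& \frac1n \log \mathbb{E} \exp\Bigl( \sum_{j=1}^d \sum_{i=1}^n \lambda_j\phi_j(x_{i,n},Y_{i,n}) \Bigr) \\
\overset{(*)}{=}{}& \frac1n \log \prod_{i=1}^n \mathbb{E} \exp\Bigl( \sum_{j=1}^d \lambda_j\phi_j(x_{i,n},Y_{i,n}) \Bigr) \\
={}& \frac1n \sum_{i=1}^n \log \int \exp\bigl( \lambda\cdot\phi^{x_{i,n}}(y) \bigr)\, \zeta^{x_{i,n}}(dy) \\
={}& \int \log\Bigl( \int \exp\bigl( \lambda\cdot\phi^x(y) \bigr)\, \zeta^x(dy) \Bigr) L^0_n(dx) \\
={}& \int \log\langle e^{\lambda\cdot\phi^x}, \zeta^x \rangle\, L^0_n(dx),
\end{aligned}
\tag{66}
\]

using the notation $\phi^x : y \mapsto (\phi_1(x,y),\dots,\phi_d(x,y))$. In $(*)$ we have used the independence of $(x_{i,n},Y_{i,n})$ to turn the expectation of the product into a product of expectations.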
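The factorisation in step $(*)$ can be verified by brute force on a finite state space. The sketch below is only an illustration with $d = 1$: the matrices `kernels` (row $x$ playing the role of $\zeta^x$) and `phi` (entries $\phi(x,y)$), and both function names, are assumptions made for this example. One function enumerates every joint outcome of the independent variables $Y_{i,n}$; the other evaluates the last line of (66) directly.

```python
import itertools
import math


def log_mgf_bruteforce(lam, phi, kernels, xs):
    """(1/n) log E exp(lam * sum_i phi(x_i, Y_i)) by enumerating all joint outcomes."""
    n = len(xs)
    states = range(len(kernels[0]))
    expectation = 0.0
    for ys in itertools.product(states, repeat=n):
        # Independence: the joint probability is the product of the kernels.
        prob = math.prod(kernels[x][y] for x, y in zip(xs, ys))
        expectation += prob * math.exp(lam * sum(phi[x][y] for x, y in zip(xs, ys)))
    return math.log(expectation) / n


def log_mgf_product(lam, phi, kernels, xs):
    """The same quantity as the average over i of log <exp(lam * phi^{x_i}), zeta^{x_i}>."""
    return sum(
        math.log(sum(p * math.exp(lam * phi[x][y]) for y, p in enumerate(kernels[x])))
        for x in xs
    ) / len(xs)
```

The enumeration grows exponentially in the number of particles, so it is only meant for a handful of particles and states; the product form is what makes the limit $n \to \infty$ tractable.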

In order to use (64) to pass to the limit $n \to \infty$ in (66), we need to show that $x \mapsto \log\langle e^{\lambda\cdot\phi^x}, \zeta^x \rangle$ is a bounded and continuous function. The boundedness follows directly from the fact that all $\phi_j$ are bounded. To prove continuity, take any convergent sequence $x_m \to x$. As $\zeta^x$ is continuous as a function from $x \in \Omega$ to $\mathcal{P}(\Omega)$, Prokhorov's Theorem gives tightness of the sequence $\zeta^{x_m}$. Thus for each $\varepsilon > 0$ there exists a compact set $K_\varepsilon \subset \Omega$ such that:

\[
\zeta^{x_m}(\Omega \setminus K_\varepsilon) < \varepsilon \quad\text{for all } m \ge 1.
\]

Using that the sequence of functions $y \mapsto e^{\lambda\cdot\phi^{x_m}(y)}$ converges uniformly on compact sets as $m \to \infty$, we have:

|(e A '^ m ,C m > - (e^Ol = l(e A ^ m - e^C") + (e A '^, C* - C m }\ 

< f | e ^™(w)_ e A-*-(y)|^m( d2/ ) 

+ / | e A-^(y) _ e A.^"d,)| ^m(^) + Ke^C* - C m )| 

< (||e A ^ m |Uoo (n) + ||e A ^|U°c(n)) r m (^\^ 




! ► 0. 
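Hence indeed $\langle e^{\lambda\cdot\phi^x}, \zeta^x \rangle$ is continuous in $x$, so we can apply (64) to find the limit:

\[
\Lambda_{\phi_1,\dots,\phi_d}(\lambda) := \lim_{n\to\infty} \Lambda_{\phi_1,\dots,\phi_d;n}(\lambda) = \int \log\langle e^{\lambda\cdot\phi^x}, \zeta^x \rangle\, \rho^0(dx).
\]

Since this function is continuously differentiable and finite throughout its whole domain ($\mathbb{R}^d$), the conditions of the Gärtner-Ellis Theorem [DZ87, Th. 2.3.6c] are met, so that $Z_{\phi_1,\dots,\phi_d;n}$ satisfies the large-deviation principle in $\mathbb{R}^d$ with rate $n$ and rate function $\Lambda^*_{\phi_1,\dots,\phi_d}$, the Fenchel-Legendre transform of $\Lambda_{\phi_1,\dots,\phi_d}$.

Next we apply the Dawson-Gärtner Theorem [DZ87, Th. 4.6.9] to find that the sequence $\{M_n\}_n$ satisfies the large-deviation principle in $C_b(\Omega^2)'$ with rate $n$ and rate functional:

\[
\begin{aligned}
I(q) :={}& \sup_{d\ge 1}\; \sup_{\phi_1,\dots,\phi_d \in C_b(\Omega^2)} \Lambda^*_{\phi_1,\dots,\phi_d}\bigl( \langle\phi_1,q\rangle, \dots, \langle\phi_d,q\rangle \bigr) \\
={}& \sup_{d\ge 1}\; \sup_{\phi_1,\dots,\phi_d \in C_b(\Omega^2)}\; \sup_{\lambda\in\mathbb{R}^d}\; \lambda\cdot\bigl( \langle\phi_1,q\rangle, \dots, \langle\phi_d,q\rangle \bigr) - \Lambda_{\phi_1,\dots,\phi_d}(\lambda) \\
={}& \sup_{\phi \in C_b(\Omega^2)} \langle\phi,q\rangle - \int \log\langle e^{\phi^x}, \zeta^x \rangle\, \rho^0(dx),
\end{aligned}
\]

where as before we write $\phi^x : y \mapsto \phi(x,y)$.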

We now show that this rate functional is indeed (65). Since $C_b(\Omega^2)^*$ is a closed subset of $C_b(\Omega^2)'$ containing $\mathcal{P}(\Omega^2)$, we have $I = \infty$ on $C_b(\Omega^2)' \setminus C_b(\Omega^2)^*$ [DZ87, Th. 4.1.5]. Thus, we only need to consider $q \in C_b(\Omega^2)^*$. For such $q$ (identified with a finitely additive measure), we write $\pi_0 q(B) := q(B \times \Omega)$ for any Borel set $B$.

• First, we show that $I(q) = \infty$ whenever $q \in C_b(\Omega^2)^*$ with first marginal $\pi_0 q \ne \rho^0$. This can be seen by restricting the supremum to $\phi$'s that depend on the first variable only:

\[
\begin{aligned}
I(q) &\ge \sup_{\phi \in C_b(\Omega)} \langle\phi,q\rangle - \int \log\langle e^{\phi(x)}, \zeta^x \rangle\, \rho^0(dx) \\
&= \sup_{\phi \in C_b(\Omega)} \langle\phi,\pi_0 q\rangle - \langle\phi,\rho^0\rangle
= \begin{cases} 0, & \text{if } \pi_0 q = \rho^0, \\ \infty, & \text{otherwise.} \end{cases}
\end{aligned}
\]

• Next, we show that $I(q) = \infty$ for any $q \in C_b(\Omega^2)^*$ that is finitely, but not countably additive. By the argument above, we only need to consider non-negative finitely additive measures with $q(\Omega^2) = 1$. For such $q$, there exists a sequence of disjoint measurable sets $A_i \subset \Omega^2$ such that



i=i 
Without loss of generality, assume that $\bigcup_{i=1}^\infty A_i = \Omega^2$. Since $q$ and $p$ are regular, one can find for any $k \ge 1$, sequences of sets $K_i \subset A_i \subset O_i$ with $K_i$ compact and $O_i$ open, such that:

oo oo 

^g(Oi)<l-i5 and $>(M^)<e" fc . (67) 

i=i i=i 

Then for each k, n > 1 there exist a continuous function : fi 2 — > [— k, 0] such that 



on U? =1 Ki, 
0, onft 2 \U™ =1 0;. 



(f>kn(x,y) 

For these functions we have, on one hand (as 0« might not be disjoint) 

n 

(fan, q) > -k g(ur =1 0,) > -k qm, (68) 

i=l 

and on the other hand 



so that 

J log<e«», CVW < / (-* + log / (Xu^Ai + e^uju*,) C*) P°(dx) 

J ensen , , / n N v 

< —k + log (p (u^ =1 ^) + e h P [n 2 \ ur =1 if*)) • 

Finally, we find for the rate functional: 



i=l 



i=l 



(69) 



> 


lim sup 










> 


lim sup 




fc— >oo 




lim sup 




k— >oo 






> 


lim sup 




k— >oo 




lim sup 




fc— ^oo 



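• Now assume that $q \in \mathcal{P}(\Omega^2)$ such that $\pi_0 q = \rho^0$. The Disintegration Theorem then allows us to write

\[
q(dx\,dy) = \rho^0(dx)\, q^x(dy)
\]

for some family of measures $\{q^x : x \in \Omega\}$. In this case:

\[
\begin{aligned}
I(q) &= \sup_{\phi \in C_b(\Omega^2)} \int \bigl( \langle\phi^x,q^x\rangle - \log\langle e^{\phi^x}, \zeta^x \rangle \bigr)\, \rho^0(dx) \\
&\le \int \sup_{\phi^x \in C_b(\Omega)} \bigl\{ \langle\phi^x,q^x\rangle - \log\langle e^{\phi^x}, \zeta^x \rangle \bigr\}\, \rho^0(dx) \\
&= \int \mathcal{H}(q^x\,|\,\zeta^x)\, \rho^0(dx) \\
&= \begin{cases} \displaystyle\iint \log\Bigl( \frac{\rho^0(dx)\,q^x(dy)}{\rho^0(dx)\,\zeta^x(dy)} \Bigr) \rho^0(dx)\,q^x(dy), & \text{if } \rho^0 q^x \ll \rho^0 \zeta^x, \\ \infty, & \text{otherwise} \end{cases} \\
&= \mathcal{H}(q\,|\,p).
\end{aligned}
\]

• We conclude the proof with the inequality in the other direction. Observe that $I$ is the Fenchel-Legendre transform of

\[
\Lambda : \phi \mapsto \int \log\langle e^{\phi^x}, \zeta^x \rangle\, \rho^0(dx) \le \log \int \langle e^{\phi^x}, \zeta^x \rangle\, \rho^0(dx) = \log\langle e^\phi, p \rangle,
\]

where the bound follows from Jensen's inequality. Hence:

\[
I(q) = \Lambda^*(q) \ge \sup_{\phi \in C_b(\Omega^2)} \bigl\{ \langle\phi,q\rangle - \log\langle e^\phi, p \rangle \bigr\} = \mathcal{H}(q\,|\,p).
\]

Since the large-deviation principle holds in $C_b(\Omega^2)^*$ with $D_I \subset \mathcal{P}(\Omega^2)$, it also holds in $\mathcal{P}(\Omega^2)$ with the same rate functional (i.e. restricted to $\mathcal{P}(\Omega^2)$) [DZ87, Th. 4.1.5]. □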

The following corollary follows immediately from the contraction principle:

Corollary 13. The sequence $\{n^{-1} \sum_{i=1}^n \delta_{Y_{i,n}}\}_n$ satisfies the large-deviation principle in $\mathcal{P}(\Omega)$ with rate $n$ and rate function:

j, on . = | inf 9er(P ,P) H(q\p), */ Q e r(p°, p), ^ 7Q ^ 
1 oo, otherwise. 

Remark 14. A straightforward approach would be to look for a large-deviation principle in the set of probability measures:

\[
A \mapsto \mathbb{P}(M_n \in A \,|\, L^0_n = \rho^0). \tag{71}
\]

However, these conditional probabilities are not well-defined: the events $\{L^0_n = \rho^0\}$ typically have zero probability. One way to deal with this is to condition on small neighbourhoods of $\rho^0$ of size $\delta$ instead, calculate the large-deviation rate functional for these conditional probabilities, and then take the limit for $\delta \to 0$. This is the approach taken in [ADPZ10]. We note that because the limits $n \to \infty$ and $\delta \to 0$ cannot be interchanged, this approach does not a priori yield a large-deviation principle in the rigorous sense.

In the approach that we adopt from [Leo07], we consider fixed initial positions so that there is no need to define the conditional probabilities above. This technique is sometimes called a quenched large-deviation principle. □



B Estimate of the fundamental solution 

In this appendix we prove the estimate on the fundamental solution, defined in Definition HI 
that is used in the proof of Theorem We expect that this estimate is not a new result, 
but since we haven't been able to find it in the literature we include the proof here for 
completeness. 

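Theorem 15. Assume $\Psi \in C_b^2(\mathbb{R})$, and let $\eta^t$ be the fundamental solution defined earlier. Then there are $\beta_0, \beta_1$ such that for every $t > 0$:

\[
\theta^t(y-x)\, e^{-\frac12\Psi(y) + \frac12\Psi(x) + \beta_0 t} \le \eta^t(y;x) \le \theta^t(y-x)\, e^{-\frac12\Psi(y) + \frac12\Psi(x) + \beta_1 t} \tag{72}
\]

for almost every $x, y \in \mathbb{R}$.

Proof. For brevity we assume that $x = 0$ and $\Psi(0) = 0$, and we omit the dependence on $x$. For $\beta \in \mathbb{R}$ define:

\[
\zeta_\beta(y,t) := \eta^t(y) - \theta^t(y)\, e^{-\frac12\Psi(y) + \beta t}.
\]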

By partial integration we obtain for all < e < T and G C%' (R x [e,T]): 

T 

(9t4>(y, t) + d yy 0(y, t) - d y ty(y)d y <f>{y, <)) (p(y, t) dy dt 

T 

(j>(y,t)fp(y,t)dydt+ I <f)(y,T)(p(y,T)dy- I <p(y, e)Cp(y, e) dy (73) 



My,t) := ( -\d y *{y) + -\d y ty(y)\ 2 + ) 0*(y) e -3*M+# 



with: 

fj», t\ ■= f —If) \S/(iA -l ^ 

Because $\partial_y\Psi$ and $\partial_{yy}\Psi$ are bounded, there are $\beta_0, \beta_1 \in \mathbb{R}$ such that:

\[ f_{\beta_0}(y,t) \;\le\; 0 \;\le\; f_{\beta_1}(y,t). \tag{74} \]
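To make (74) concrete: assuming $\theta^t > 0$ (as for the heat kernel), the sign of $f_\beta$ is the sign of the factor $-\tfrac12\partial_{yy}\Psi + \tfrac14|\partial_y\Psi|^2 + \beta$. One admissible, though not optimal, choice is therefore $\beta_1 = \tfrac12\|\partial_{yy}\Psi\|_\infty$ and $\beta_0 = -\tfrac12\|\partial_{yy}\Psi\|_\infty - \tfrac14\|\partial_y\Psi\|_\infty^2$. The Python sketch below checks this on a grid for the illustrative potential $\Psi(y) = \sin y$. This potential is our own choice and is not taken from the paper, and all function names are hypothetical.

```python
import math

# Illustrative potential (an assumption, not from the paper): Psi(y) = sin(y),
# which is bounded with bounded first and second derivatives.
def dpsi(y):
    """First derivative of Psi(y) = sin(y)."""
    return math.cos(y)

def ddpsi(y):
    """Second derivative of Psi(y) = sin(y)."""
    return -math.sin(y)

def bracket(y, beta):
    """Sign-determining factor of f_beta: -Psi''/2 + |Psi'|^2/4 + beta."""
    return -0.5 * ddpsi(y) + 0.25 * dpsi(y) ** 2 + beta

# Sup norms of the first and second derivatives of sin.
SUP_DPSI = 1.0
SUP_DDPSI = 1.0

# One admissible (not optimal) choice of the constants in (74).
beta_1 = 0.5 * SUP_DDPSI
beta_0 = -0.5 * SUP_DDPSI - 0.25 * SUP_DPSI ** 2

# Evaluate the factor on a grid covering [-10, 10].
grid = [-10.0 + 0.01 * k for k in range(2001)]
max_low = max(bracket(y, beta_0) for y in grid)
min_high = min(bracket(y, beta_1) for y in grid)

# True if the factor is nonpositive for beta_0 and nonnegative for beta_1.
print(max_low <= 0.0 <= min_high)
```

A grid check of this kind only illustrates the inequality; the general statement follows directly from the boundedness of $\partial_y\Psi$ and $\partial_{yy}\Psi$.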

First we exploit this inequality for $\beta_1$. Let $\phi$ be the solution of the adjoint problem:

\[ -\partial_t\phi = \partial_{yy}\phi - \partial_y\Psi\,\partial_y\phi \tag{75} \]

with end condition:

\[ \phi(y,T) = \phi^T(y) := H\big(\zeta_{\beta_1}(y,T)\big), \]

where $H$ is the Heaviside function. Again by [Fri64, Th. 1.10] there exists a positive fundamental solution $\eta^*$ and hence a positive bounded solution $\phi \in C^{2,1}(\mathbb{R} \times [0,T))$ to (75). However, (73) requires the test functions to be in $C_b^{2,1}(\mathbb{R} \times (0,T])$. To this aim we approximate $\phi$ in the following way. First, let $\phi_n^T$ be a sequence in $C_b^\infty(\mathbb{R})$ such that

\[ \phi_n^T \rightharpoonup \phi^T \quad \text{weakly-* in } L^\infty(\mathbb{R}). \]



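Next, let $\phi_n \in C_b^{2,1}(\mathbb{R} \times [0,T])$ be the solution of (75) with approximated end condition $\phi_n^T$. For this sequence (73) becomes:

\[
\begin{aligned}
0 &= \int_\varepsilon^T\!\!\int_{\mathbb{R}} \phi_n(y,t)\, f_{\beta_1}(y,t)\,dy\,dt + \int_{\mathbb{R}} \phi_n^T(y)\,\zeta_{\beta_1}(y,T)\,dy - \int_{\mathbb{R}} \phi_n(y,\varepsilon)\,\zeta_{\beta_1}(y,\varepsilon)\,dy \\
&\xrightarrow[\varepsilon \to 0]{(i.)} \int_0^T\!\!\int_{\mathbb{R}} \phi_n(y,t)\, f_{\beta_1}(y,t)\,dy\,dt + \int_{\mathbb{R}} \phi_n^T(y)\,\zeta_{\beta_1}(y,T)\,dy \\
&\xrightarrow[n \to \infty]{(ii.)} \int_0^T\!\!\int_{\mathbb{R}} \phi(y,t)\, f_{\beta_1}(y,t)\,dy\,dt + \int_{\mathbb{R}} H\big(\zeta_{\beta_1}(y,T)\big)\,\zeta_{\beta_1}(y,T)\,dy,
\end{aligned}
\tag{76}
\]

using properties (i.) and (ii.) that we will prove below. From this we infer for the positive part of $\zeta_{\beta_1}$:

\[ 0 \;\le\; \int_{\mathbb{R}} \zeta_{\beta_1}^+(y,T)\,dy \;\overset{(76)}{=}\; -\int_0^T\!\!\int_{\mathbb{R}} \underbrace{\phi(y,t)}_{\ge 0}\, \underbrace{f_{\beta_1}(y,t)}_{\ge 0}\,dy\,dt \;\le\; 0. \]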

Analogously we use the other inequality from (74) and conclude that for all $T > 0$:

\[ \zeta_{\beta_1}(y,T) \;\le\; 0 \;\le\; \zeta_{\beta_0}(y,T) \quad \text{for almost every } y \in \mathbb{R}, \]

which proves the statement.

We still owe the reader the proof of the two approximations in (76).

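(i.) The argument follows from $\zeta_{\beta_1}(\cdot,\varepsilon) \rightharpoonup 0$ weakly in $L^1(\mathbb{R})$ as $\varepsilon \to 0$. Then for any fixed $n$:

\[
\begin{aligned}
\left| \int_{\mathbb{R}} \big(\phi_n(y,\varepsilon) - \phi_n(y,0)\big)\,\zeta_{\beta_1}(y,\varepsilon)\,dy \right|
&= \left| \int_{\mathbb{R}} \int_0^\varepsilon \partial_t\phi_n(y,t)\,dt\;\zeta_{\beta_1}(y,\varepsilon)\,dy \right| \\
&\le \varepsilon\, \|\partial_t\phi_n\|_{L^\infty(\mathbb{R} \times [0,T])} \underbrace{\int_{\mathbb{R}} |\zeta_{\beta_1}(y,\varepsilon)|\,dy}_{\text{bounded}} \;\xrightarrow[\varepsilon \to 0]{}\; 0.
\end{aligned}
\]

Hence:

\[ \int_{\mathbb{R}} \phi_n(y,\varepsilon)\,\zeta_{\beta_1}(y,\varepsilon)\,dy = \int_{\mathbb{R}} \big(\phi_n(y,\varepsilon) - \phi_n(y,0)\big)\,\zeta_{\beta_1}(y,\varepsilon)\,dy + \int_{\mathbb{R}} \phi_n(y,0)\,\zeta_{\beta_1}(y,\varepsilon)\,dy \;\xrightarrow[\varepsilon \to 0]{}\; 0. \]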
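(ii.) For the second convergence in (76), we can assume that the approximation of the end condition satisfies:

\[ 0 \le \phi_n^T(y) \le \phi^T(y) \quad \text{for all } y \in \mathbb{R}. \]

Therefore:

\[ |\phi_n(y,t)\, f_{\beta_1}(y,t)| \;\le\; |\phi(y,t)\, f_{\beta_1}(y,t)| \;\le\; \underbrace{\|\phi^T\|_{L^\infty(\mathbb{R})}\, |f_{\beta_1}(y,t)|}_{\in L^1(\mathbb{R} \times (0,T))}. \]

Since for the fundamental solution $\eta^*$ of the adjoint problem (75) there holds $z \mapsto \eta^{*,t}(y,z) \in L^1(\mathbb{R})$, we have:

\[ \phi_n(y,t) = \int_{\mathbb{R}} \eta^{*,t}(y,z)\,\phi_n^T(z)\,dz \;\longrightarrow\; \int_{\mathbb{R}} \eta^{*,t}(y,z)\,\phi^T(z)\,dz = \phi(y,t) \]

pointwise. The Lebesgue Dominated Convergence Theorem then gives

\[ \int_0^T\!\!\int_{\mathbb{R}} \phi_n(y,t)\, f_{\beta_1}(y,t)\,dy\,dt \;\xrightarrow[n \to \infty]{}\; \int_0^T\!\!\int_{\mathbb{R}} \phi(y,t)\, f_{\beta_1}(y,t)\,dy\,dt. \]

□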